
Syntactically Awesome StyleSheets

Posted by haakon, Mon, 25 Jun 2007 23:53:00 GMT

Have you heard of them? In their own words, Sass is:

a meta-language on top of CSS that’s used to describe the style of a document cleanly and structurally, with more power than flat CSS allows. Sass both provides a simpler, more elegant syntax for CSS and implements various features that are useful for creating manageable stylesheets.

It has cool features like:

  • Better formatting for complexly nested selectors
  • Math support (use simple mathematical expressions)
  • Constants – define a color in one place and use it throughout your file. Want to change it later? Change it in one place only!

To learn more, check out the Sass Documentation. Sounds nifty, eh? I just came across Sass last week and have since changed my CSS files over, and it is a definite improvement. Less typing, more readable, better in every way. The only downside is that my editor didn’t give me nice color syntax highlighting for my .sass files.

I’ve just started using the NetBeans 6 editor, which has some very nice Ruby support. Here is a movie showing off some of its features. Unfortunately, it currently does not have support for Sass files. So, I took a little bit of time and created a NetBeans module to provide syntax highlighting for Sass files. The beauty of syntax highlighting is that in many ways it works like a compiler, letting you know of errors in your code earlier rather than later. While working with Sass I found it a little annoying to make a mistake in my Sass file and not see the problem until I actually refreshed the page in a browser. Color highlighting makes your mistakes more obvious and easier to find. For example, when I was converting my CSS files to Sass files, I would often make an error like the following:

a
  :font-size: 10px

I would put the colon to the front of the property name, but forget to remove the colon at the end. Color highlighting can highlight that as you are working on the file making it easier to catch errors. At the moment this is very much in a beta state, but please feel welcome to download and install the module. Currently it only supports sass property names with the colon at the front, rather than at the end.

[Update]

I’ve since found a more recent plugin that someone else is doing a fine job of maintaining. Download the plugin by Dylan Bruzenak

Enjoy!

~haakon

Final methods in ruby (prevent method override)

Posted by haakon, Fri, 06 Oct 2006 14:10:00 GMT

In a reversal of fortune, I recently found myself wishing Ruby was more like Java. Java has the ideas of abstract base classes and final methods (methods which should not be overridden in child classes). Such ideas don’t really exist in Ruby.

My problem was this. I had a base class Cronjob which represented some job which was going to run under cron. This class managed stuff like setting up logging, db connections, etc. I then wanted other jobs to be able to extend Cronjob and take advantage of the base class:

class Cronjob
  def initialize
    # do useful stuff here, setting up db connections, logging, etc.
  end

  def run   # method which no child class should override
    start = Time.now
    puts "starting at #{start}"
    run_job    # method which child class should override
    stop = Time.now
    puts "finished at #{stop}, took #{stop-start} seconds"
  end
end

This then allowed me to write a child class that did the real work:

class MyJob < Cronjob
  def run_job
    # real work goes here
  end
end 

And then call:

job = MyJob.new
job.run

All well and good. If people use the base class correctly, they get some nice bits of functionality. However, I eventually noticed that someone had written this class:

class TheirJob < Cronjob
  def run
    # real work goes here
  end
end 

This is bad! The author thinks they are taking full advantage of the base class, but they are not. In reality they are overriding the run method, and Ruby does not complain a bit. In Java I could use the final keyword to say that a method should not be overridden by any child class. What I wanted was to be able to write:

class Cronjob
  final :run

  def run
  end
end

So, how can we make this work? Here is my solution that does the job.

class Object
  @@final_methods = {}

  class << self
    def prevent_override?(method_name)
      @@final_methods.each do |class_name, final_methods|
        ancestors = self.ancestors
        ancestors.shift # remove myself from the list
        if ancestors.include?(class_name) and
           final_methods.include?(method_name)
          raise "Child class '#{self}' should not override parent class method '#{class_name}.#{method_name}'."
        end
      end
    end

    def method_added(method_name)
      prevent_override?(method_name)
    end

    def final(*names)
      @@final_methods[self] = names
    end

  end
end 

Now if someone tries to override the method in a child class they get an exception:

in `prevent_override?': Child class 'TheirJob' should not
override parent class method 'Cronjob.run'.(RuntimeError)

How does it work? The magic is possible because Ruby has a method called method_added. This gets called when a method is added to a class. So, when a source code file is being processed, if a method is defined with “def foo”, after the method has been added to the class this method_added method gets fired with “foo” as the argument. We can then implement the method with our desired behavior. In my case, I just wanted to blow up with an exception which is easy enough to do.
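
To watch the hook fire on its own, here is a minimal sketch. The class name Tracked and the added list are made up for this illustration and are not part of the solution above; the only real Ruby API involved is method_added, which receives the new method's name as a Symbol.

```ruby
# Hypothetical class used only to watch the method_added hook fire.
class Tracked
  @added = [] # class-level instance variable collecting the names

  class << self
    attr_reader :added

    # Ruby calls this after each instance method is defined on the class,
    # passing the method name as a Symbol.
    def method_added(method_name)
      @added << method_name
    end
  end

  def foo; end
  def bar; end
end

p Tracked.added # prints [:foo, :bar]
```

Note that the hook only records the names here. Raising an exception from the same spot is what turns it into the override check.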

Ruby also has a method to get a class’s “ancestors”.

>> true.class.ancestors
=> [TrueClass, Object, Kernel]
>> [].class.ancestors
=> [Array, Enumerable, Object, Kernel]

So, the logic becomes simple. The final method just stores a hash of class => [methods which you cannot override]. Then, on method_added we do a check to see if the method being added is in this hash, and Bob’s your uncle!

So, while it was mildly surprising to find Ruby missing a language feature that I wanted, the language is powerful enough that you can “add to” the language! I’m also half expecting people to weigh in with suggestions of a better way to do this. I would be pleased to hear better variations. This solution definitely doesn’t make it impossible to override the method in a child class; a determined person could get around it. But it does solve my problem of someone inadvertently overriding the method.

Update: Now available via gems, thanks to Dr. Nic’s newgem magic:

gem install finalizer

Java Permgen space, String.intern, XML parsing

Posted by haakon, Sat, 09 Sep 2006 08:16:00 GMT

This week I have been poking through the innards of a web application trying to find out why we were leaking memory (in the permanent generation) like crazy. After a bit of digging I isolated it down to a line that looked like this:

Document doc = SAXParser.new().parse( stringContainingXML );

My first inclination was to blame the parser. Everyone knows that XML parsers are troublemakers, right? But, in the end I had to conclude the leak was entirely the fault of our code. But I learned a bit along the way! The details:

Permgen space – what is it?

The memory a jvm uses is split up into three “generations”: young (eden), tenured, and permanent. This is done to improve the performance of garbage collection. Most objects are short lived (local variables, etc), and so they come and go in the young generation. Some objects (like things in caches) stick around for a while and get promoted from the young to the tenured generation. Some things live “forever”, like the classes themselves, and “interned” strings. These go straight into the permanent generation.

Most memory leaks involve normal objects, and you run out of heap space by filling up the young and tenured memory spaces. Sometimes though, you might see “java.lang.OutOfMemoryError: PermGen space”. The most common cause is that you simply don’t have enough space to load up all your classes. Use the param ‘-XX:MaxPermSize=100m’ to adjust to a desired value. You may also find that doing a hot deploy of a war into tomcat eventually uses up permgen space. That is a different issue which I won’t discuss here.

If you observe that your app is leaking permgen space just while it is running (and not because you are hot deploying a war), then you have an interesting problem. The issue is most likely to be either an errant ClassLoader, or String.intern gone awry. ClassLoaders are an interesting beast, but our problem was with interned strings.

What is String.intern?

String.intern is an optimization feature. Doing a double equals (==) compare of two strings is a common mistake people make, as they forget that this is doing an identity comparison. (a == b) is checking if a and b are in fact the same object. Usually, what you really want to do is check if (a.equals(b)). This does the character by character comparison that you probably want.

The thing is, the latter comparison is much slower than an identity comparison. So, a nice performance optimization can be to maintain a canonical list of strings that allows you to do the fast identity comparisons instead. It would be easy enough to write such a thing for yourself, but it is included in Java these days with the String.intern method. So Java maintains a pool of these “canonical” strings to allow you to get some better performance when dealing with strings. But, this pool lives in the permgen space!

Why not intern all strings?

A natural question might be why one shouldn’t just intern every string. Well, there are two reasons why this wouldn’t work. First, you have finite memory. If you stored every string you ever saw into permgen space with intern, you would run out of memory reasonably quickly. Second, the reason you are using intern in the first place is as a performance optimization. It happens to be faster to retrieve the canonical string from the intern string pool than it is to do a character by character string comparison. However, as the intern string pool grows without bound, finding your string in the pool would probably end up costing more than just doing the character comparison. So, you only want to intern strings which you use frequently throughout the life of your app.

XML parsers seem to use String.intern (or something similar)

XML parsing just happens to be a whole lot of string parsing. So, it is not surprising to find that parsers take advantage of intern. But, we just said that you probably don’t want to intern every string you see, so what does a parser like Xerces intern? According to (http://xerces.apache.org/xerces2-j/features.html), “All element names, prefixes, attribute names, namespace URIs, and local names are internalized using the java.lang.String#intern(String):String method”. These are all the strings that are going to be seen repeatedly when parsing multiple XML documents with the same DTD. Notice that they don’t intern attribute values and tag contents. These are what change from document to document; they are your actual data. To intern these would be to intern your entire data space, and we would be facing the previously mentioned problem of effectively interning all strings.

Our problem

At last we arrive at our problem. We were parsing XML documents and finding that our permgen was steadily growing. At first we just enlarged permgen, assuming we had a lot of classes to load. But when we were blowing up with 500 megs of permgen space used up, it was time to find the problem.

After a bunch of digging, what we found was this. The XML we were parsing was not really XML. It was well formed (tags opened and closed properly, nested properly, etc). But, it was XML for which it would be impossible to write a DTD because the data lived in the tag space. An example will show it best. We had tags that looked like:

<data.6541237895.field1>field one val</data.6541237895.field1>
<data.6541237895.field2>field two val</data.6541237895.field2>
<data.7813329781.field1>field one val</data.7813329781.field1>
<data.7813329781.field2>field two val</data.7813329781.field2>
...

The numbers inside of the tag itself were data! So, there was no limited, finite number of tags that could exist in an XML document of this form. Rather, you could have as many tags as could be represented by a ten digit number. To make it worse, there were different values like “foobar” and “name” and many others for each number. The details are boring, but the important bit was that our tag space was as big as our data space. The XML parser was merrily interning every tag string it saw as a reasonable performance optimization. But, as our XML was not true XML, everything came crashing down.

So how to fix it?

  1. Maybe the best solution would be to fix the bad XML. In this case, we were not the source of the XML so this ideal option was not practical.
  2. Supposedly one can turn off the “feature” of interning via the SAX parser interface in Java. In practice, none of the parsers we tried allowed us to turn it off (let me know if you find one that does!).
  3. It would be nice if the interned strings could just be garbage collected like any other Java memory. I’ve seen conflicting reports on this. This article appears to show that interned strings can be collected.
  4. Don’t use an XML parser if you aren’t really parsing XML.

Number 4 may seem like a copout, but it is the option we landed on. We now use a few regular expressions to pull the data we need from the “XML” document. This happens to both fix our memory problem and result in a performance improvement. Apparently selecting just the parts of the document we need with a regex is faster than parsing the whole thing with an XML parser.
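
To give a feel for the regex approach, here is a rough sketch. It is written in Ruby for brevity, not the Java our application actually uses, and it is not our production code. The sample text mirrors the tags shown earlier, and the names SAMPLE, TAG_PATTERN and extract_records are made up for this illustration.

```ruby
# Hypothetical sample in the same shape as the tags shown above.
SAMPLE = <<~XML
  <data.6541237895.field1>field one val</data.6541237895.field1>
  <data.6541237895.field2>field two val</data.6541237895.field2>
  <data.7813329781.field1>field one val</data.7813329781.field1>
  <data.7813329781.field2>field two val</data.7813329781.field2>
XML

# Captures the record id, the field name and the field value. The \1 and \2
# backreferences require the closing tag to match the opening tag.
TAG_PATTERN = %r{<data\.(\d+)\.(\w+)>(.*?)</data\.\1\.\2>}m

# Returns a hash of record id => { field name => value }.
def extract_records(text)
  records = Hash.new { |hash, id| hash[id] = {} }
  text.scan(TAG_PATTERN) do |id, field, value|
    records[id][field] = value
  end
  records
end

extract_records(SAMPLE).each do |id, fields|
  fields.each { |field, value| puts "#{id} #{field} = #{value}" }
end
```

Nothing here builds a document tree or interns tag names, so the ever-changing numbers in the tags are treated as ordinary data.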

How to find these problems

Tracking down these problems can be challenging:

  1. Profiling is your friend. Find a good profiler and learn how to use it (JProfiler works nicely).
  2. jmap and jstat are useful tools that come with the jdk. They give you info about memory usage, etc.
  3. visualgc (jvmstat) is a nice tool for seeing an overall picture of your memory usage.
  4. Understand how garbage collection works
    (http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html)
  5. Get familiar with JVM args that help with this kind of debugging and performance optimization: verbose GC logging, tracing of class loading, etc.
    (http://java.sun.com/docs/hotspot/gc1.4.2/faq.html, http://www.brokenbuild.com/blog/2006/08/04/java-jvm-gc-permgen-and-memory-options/)

~haakon

Ruby Retries

Posted by haakon, Fri, 16 Jun 2006 16:20:00 GMT

A saying often attributed to Benjamin Franklin goes, “The definition of insanity is doing the same thing over and over and expecting different results.” Sometimes, though, a retry is exactly what you need. You may need to call something that is not reliable. Maybe it is a network connection that might be down temporarily. How often have you built retry logic around a bit of code that just might fail on occasion? This snippet takes care of that situation nicely.


class Integer  # originally Fixnum, which was merged into Integer in Ruby 2.4
  def tries(message)
    current_try_num = 1
    begin
      yield current_try_num
    rescue => e
      if current_try_num >= self
        raise
      else
        puts "Try #{current_try_num} failed (#{message}): #{e}"
        current_try_num = current_try_num.next
        retry
      end
    end
  end
end

10.tries('doing work') do
  raise 'network connection failed' if rand() < 0.7
  puts 'success'
end

output:
>ruby tries.rb
Try 1 failed (doing work): network connection failed
Try 2 failed (doing work): network connection failed
success
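
Because the example above fails at random, here is a deterministic sketch of the same begin/rescue/retry logic that also shows what happens when the attempts run out. The helper name with_retries is made up for this illustration and is not part of the retry gem; it is a plain method, so nothing is monkey patched.

```ruby
# Hypothetical helper with the same logic as the tries method above.
def with_retries(max_tries)
  attempt = 1
  begin
    yield attempt
  rescue StandardError
    raise if attempt >= max_tries # out of attempts: re-raise the last error
    attempt += 1
    retry                         # re-run the begin block from the top
  end
end

# Fails on the first two attempts and succeeds on the third.
result = with_retries(3) do |attempt|
  raise "attempt #{attempt} failed" if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts result # prints "succeeded on attempt 3"

# With too few attempts, the final failure propagates to the caller.
begin
  with_retries(2) { |attempt| raise "attempt #{attempt} failed" }
rescue RuntimeError => e
  puts "gave up: #{e.message}" # prints "gave up: attempt 2 failed"
end
```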

Update! 2007-06-01

This is now available as a ruby gem! To install:

sudo gem install retry

http://retry.rubyforge.org