Wednesday, May 20, 2009

Usecases using git log

I did a lightning talk (< 10 min) about 'git log' @ Montreal.rb yesterday. My sources were Scott Chacon's git log screencast, Nick Quaranto's gitready post, as well as git-log manpages. Here are my notes:

Usecase: I want to be up to date with the latest activity

git log
git log -4
git log --since="1 day"
git log --since="2 hour"
git log -2 --stat
git log -2 --name-status
git log --since='2009-05-05' --until='3 days'
git log -2 --author='Martin C'
git log 9485fdd2..7b766227 # optionally specify a file



Usecase: I want the latest changes on a specific file

git log [file] # gives you all commits for that specific file
git log -3 [file] # gives you the 3 last commits
git log -p [file] # gives you all diffs for every commit of that file
git log -2 -p db/schema.rb
git blame [file] # gives SHA1, author and line changed, e.g. git blame app/controllers/questions_controller.rb



Usecase: I want the changes of a commit whose comment I partly remember

git log --pretty=oneline | grep [comment]
git log --grep='^user icon'
git log --grep='user icon' -i
git show [SHA1]
git show [SHA1]:[file]


Usecase: I want stats on commits

git log --pretty=oneline | wc -l
git log --pretty=oneline --no-merges | wc -l  # no-merges: do not show commits that have more than 1 parent
git log --pretty=oneline --author='Martin C' | wc -l
git log --pretty=format:'%an' | sort | uniq -c | sort -n
git log --pretty=format:'%h by %an'
git log -4 --pretty=format:'%h (%H) by %an %ar (%ad) %s'
git log -4 --pretty=format:'%h by %an %ar (%ad) %s' db/schema.rb
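
The author tally above (sort | uniq -c | sort -n) can also be done in Ruby when you want to post-process the numbers. Here is a minimal sketch, assuming the input is the output of git log --pretty=format:'%an' (one author name per line). The name commits_per_author is made up for illustration:

```ruby
# Count commits per author from the output of:
#   git log --pretty=format:'%an'
# Returns [author, count] pairs sorted by ascending count,
# much like `sort | uniq -c | sort -n`.
def commits_per_author(log_output)
  counts = Hash.new(0)
  log_output.each_line do |line|
    author = line.strip
    counts[author] += 1 unless author.empty?
  end
  # Ties are broken alphabetically so the order is stable.
  counts.sort_by { |author, count| [count, author] }
end

# Usage inside a git repository (assumes git is on the PATH):
#   commits_per_author(`git log --pretty=format:'%an'`).each do |author, count|
#     puts "#{count.to_s.rjust(7)} #{author}"
#   end
```

Keeping the parsing in a function that takes a string means it can be tried on pasted output without a repository at hand.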

Saturday, May 9, 2009

RailsConf 2009: what to remember

I attended RailsConf 2009 and I'm quite happy about it. Being close to 2 years into the Rails world, I wish some of the presentations had been aimed at a more advanced audience. The most interesting parts were the panels and keynotes.

Note: a few presentations are available through blip.tv. All in all, and somewhat briefly, what to remember from RailsConf 2009? If I had to choose 2 words to summarize it, I'd pick productivity and Rack. If you need more details, keep on reading!


Keynote by DHH

Rails has been around for 5 years (2004-2009)
Rails 3 will have:
  • new routing
  • XSS protection
  • unobtrusive and agnostic JS
  • More agnosticism (e.g. ORM-agnostic)
The real secret to high productivity comes from "renegotiating the requirements", i.e. make sure the said requirements fulfill the need in its simplest form (which is very often different from the way it was originally expressed).


Keynote with DHH and Tim Ferriss

Although a lot of people didn't enjoy Tim Ferriss' keynote (which was anything but terse, and seemingly too far removed from software engineering), it nevertheless had some interesting points.

  • Set rules for one's self
Tim Ferriss explained that having very little time to do something forced him to set rules so that he could focus on the most important things. This resonates with Eric Evans' conviction about identifying and subsequently focusing on the core business value. Or the idea that 80% of your time should be spent where your business has the most value, and not only 20% of your time (as is often the case).

He also talked about 'lifestyle design' as opposed to the 'deferred lifeplan' (which has retirement as one of its assumptions). I thought that this resonated with the concept of a 'lifestyle business'.

  • convention over configuration
And here is a parallel with a core concept in Rails:

The idea of convention over configuration is a very appealing one because you want to have rules and rituals for everything that does not involve your core competencies, so that you can use your creative resources where they are most valuable.

  • Information diet
The idea is to develop 'selected ignorance'. In our digital world, there is more accessible information than it is possible to digest. So we need to create barriers or filters in order to select.

"The more you look at how media/news is generated, the less you believe it's necessary to read it, partially because it's biased or inaccurate".

  • Path to success?
Through all conversations he had with successful people in technology or in other domains, he consistently heard one piece of advice:

"There is no one path to success or productivity. It's very personal. But there is one path to be completely miserable [or failure], and that is trying to please everybody"


Keynote by Chris Wanstrath

  • Productivity
This idea of friction, it's a lot like the inverse of productivity. Being productive means getting things done efficiently and effectively. Friction keeps you from doing those things, slows you down, conspires against you, wastes energy.
  • Examples of productivity
Here are two of the most important investments you can make when starting a new business: having professionals file the paperwork and handle the numbers.
We all love Rails because it makes most of the tedious stuff go away.
And testing is famous because, well, bugs and bad design are the worst.
  • Focus on the community
So let's follow these examples. Let's create more projects that scratch an itch or ease some pain. Let's stop obsessing about which test framework to use and start obsessing about building sites that solve problems. Let's stop arguing about languages and continue improving our favorite ones. Let's stop blogging lengthy tutorials to get RSS subscribers and start contributing to official documentation efforts.

Keynote message: Let's focus more on code and less on talk. More on the community and less on ourselves.


Keynote by Robert Martin

If Tim Ferriss' keynote was the least appreciated, Uncle Bob's keynote was certainly the most appreciated. It was hailed as both entertaining and interesting. His keynote title was: "What killed SmallTalk and might kill Ruby?" Here are his answers:
  • lack of discipline (specifically: using proper TDD)
  • arrogance, lack of humility ('us' vs 'them')
  • unwillingness to solve the dirty problems (those of the enterprise)
In addition to that, some catchy phrases:
  • SmallTalk epitomized what OO was about
  • SmallTalk was the most powerful language in the 70s and 80s
  • what is clean [code] today might not be clean [code] tomorrow
  • Tests eliminate fear
  • professionalism is the disciplined wielding of power

Rack is the new cool kid

Several talks at RailsConf were about Rack, which really is a very simple way to plug functionality in before and/or after the handling of a web request. People were talking about it, and there certainly was a buzz.

Personally, I think Rack middleware is very similar to a ServletFilter in the filter chain, in the Java world. If you haven't played with Rack yet, the best introduction to Rack I could recommend is Dan Webb's 8-min presentation.
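
To make the "before and/or after" idea concrete, here is a minimal sketch in plain Ruby, without even requiring the rack gem. It relies only on the Rack convention that an app is any object responding to call(env) and returning [status, headers, body]. TimingHeader and the x-elapsed-ms header are made-up names for illustration:

```ruby
# A Rack application is any object that responds to #call(env)
# and returns [status, headers, body].
hello_app = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}"]]
end

# A middleware wraps another app, so it can act before and after
# the wrapped app handles the request (hypothetical example).
class TimingHeader
  def initialize(app)
    @app = app
  end

  def call(env)
    started = Time.now                          # before the request is handled
    status, headers, body = @app.call(env)
    elapsed_ms = ((Time.now - started) * 1000).round
    # after the request is handled: decorate the response
    [status, headers.merge("x-elapsed-ms" => elapsed_ms.to_s), body]
  end
end

stack = TimingHeader.new(hello_app)
status, headers, body = stack.call("PATH_INFO" => "/ping")
puts status
puts headers.inspect
puts body.join
```

With the rack gem installed, the same class would be plugged in with "use TimingHeader" in a config.ru file, which is where the resemblance to a filter chain shows.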


Rails innovations

Gregg Pollack and Jason Seifer (the RailsEnvy guys) gave a presentation about the Rails innovations of the last year. Here are the pieces of software worth mentioning:
  • RubyMine IDE
  • Rack
  • Rails metal
  • Rails templates
  • metric-fu gem
  • Cucumber for BDD
  • FakeWeb
  • Spike (log analyser)
  • Ultrasphinx (full-text search)

Thursday, May 7, 2009

Agile development tips, by David W. Frank

I watched the talk by David W. Frank (from Pivotal Labs) at RailsConf. It was interesting to listen to advice on agile software development, although I disagree with some of it. Here are my highlights and some thoughts afterwards.


Stick to conventions
  • follow the ground rules
  • be a lord of discipline
  • pop goes the story stack (pick the very first task on top)

Stay in sync
  • cubicles kill
  • go to lunch as a team
  • keep up the jibber jabber (talking while working)

Tools matter

  • rtfw (or be quick to google it)
  • get stories done
  • this is a whiteboard
  • don't forget the hardware

Pair
  • live together, die alone
  • let the rookie type
  • pairs stink after 2 days
  • two pairs are better than one

Test drive
  • beat the blues
  • tests, not comments
  • ping pong isn't just table tennis

Check your dashboard
  • every failing test is sacred
  • fix first, ask questions later
  • slow suites are project heart disease

Work simply
  • architecture is a 4-letter word ('test')
  • make it green, then clean it
  • check-in early and often
  • kill dead code dead

My thoughts
  • pop goes the story stack
Code ownership is generally a bad idea, but in reality you have deadlines, i.e. time is a constraint. If you need to be effective, some tasks will be done much more quickly by certain team members (and that might not be you). You shouldn't refuse a task because it's a boring chore (pleonasm intended); you should refuse a task if you feel it would take you substantially more time than it would take another team member.

  • go to lunch as a team
But not all the time. Maybe 50% of the time? Lunch is a very important break during the day, and it can be relaxing (sometimes needed) to have lunch with some people outside of your team.

  • keep up the jibber-jabber
Nah. Totally not. Go for open spaces, but no to the jibber-jabber. You should only interrupt another team member when you really can't deal with the issue yourself. That's why a chat channel (e.g. Campfire) is useful. People will go read it whenever they feel like it, instead of being interrupted in the middle of something.

  • tests, not comments
Absolutely, yeah. Comments are code pollution, very often. If you feel you need to write a comment, it probably means you need to refactor or rename method/class names.

Friday, March 27, 2009

Ruby quiz (easy) #10

Question: How do you turn these two lines into a one-liner?



value = compute_value

value = (value > CONSTANT) ? value : CONSTANT




Solution:


value = [compute_value, CONSTANT].max
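
A quick way to convince yourself that the two forms agree. The quiz doesn't define compute_value or CONSTANT, so the value 10 and the two method names below are stand-ins for illustration:

```ruby
CONSTANT = 10 # stand-in value, the quiz leaves it undefined

# Original two-step form: keep the value unless it is below CONSTANT.
def floor_with_ternary(value)
  (value > CONSTANT) ? value : CONSTANT
end

# One-liner form from the solution.
def floor_with_max(value)
  [value, CONSTANT].max
end

[3, 10, 42].each do |v|
  puts "#{v}: ternary=#{floor_with_ternary(v)} max=#{floor_with_max(v)}"
end
```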




Wednesday, March 25, 2009

Unobtrusive metaprogramming, by Sean O'Halpin

I watched Sean O'Halpin's 30-minute Ruby Manor presentation (Nov 2008) called "Unobtrusive metaprogramming". He gave a few tricks about monkeypatching.



Here are my highlights.


Problems with monkey patching "the Rails way"
  • namespace pollution
  • documentation pollution 
  • cognitive pollution 

Case in point:



>> [Kernel, Object, Class, Module].inject({}){|hash, klass| hash[klass] = klass.methods.size; hash}
=> {Object=>98, Kernel=>159, Module=>99, Class=>99}
>> require 'active_support'
=> true
>> [Kernel, Object, Class, Module].inject({}){|hash, klass| hash[klass] = klass.methods.size; hash}
=> {Object=>187, Kernel=>225, Module=>188, Class=>188}
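
Why do all four counts climb together in the irb session above? Kernel, Object, Class and Module are themselves objects, so any method added to Object shows up on each of them. Here is a small sketch of that effect without ActiveSupport, where hypothetical_blank? is a made-up method standing in for a real core extension:

```ruby
# Count the methods visible on each of the core objects,
# the same measurement as in the irb session.
def core_method_counts
  [Kernel, Object, Class, Module].each_with_object({}) do |klass, counts|
    counts[klass] = klass.methods.size
  end
end

before = core_method_counts

# Monkey patch: one new public instance method on Object (hypothetical).
class Object
  def hypothetical_blank?
    respond_to?(:empty?) ? empty? : !self
  end
end

after = core_method_counts

# Each count grows by one: all four are instances of classes
# that inherit from Object.
after.each { |klass, count| puts "#{klass}: #{before[klass]} -> #{count}" }
```

One patch, four polluted namespaces: that is the multiplication the talk was warning about.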




Recommendations
  • indirect access (using functions)
  • namespaces
  • hide your mess
So for instance, instead of monkey patching the Array class, we can do something like this:


a = [[1,2,3,[4],5]]

Doodle::Utils.flatten_with_1_level(a) # => [1,2,3,[4],5]
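
The notes don't show how Doodle::Utils is implemented, so here is my own minimal sketch of the same idea: a helper in its own namespace that leaves Array untouched. SketchUtils and flatten_one_level are made-up names, and I'm assuming Array#flatten with a level argument (Ruby 1.8.7 and later):

```ruby
# Hypothetical namespace; not the actual Doodle::Utils code.
module SketchUtils
  # Flatten exactly one level of nesting, leaving deeper arrays intact.
  def self.flatten_one_level(array)
    array.flatten(1)
  end
end

a = [[1, 2, 3, [4], 5]]
p SketchUtils.flatten_one_level(a)   # => [1, 2, 3, [4], 5]
p a.respond_to?(:flatten_one_level)  # => false, Array itself is untouched
```

The call site is a little longer than a.flatten_one_level would be, but nothing leaks into Array's method list or documentation.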



The plea

Don't monkeypatch: Object, Kernel, Module, Class, Hash, Array, etc.


Every time you monkeypatch Object, a kitten dies.


Thursday, March 12, 2009

Hashtweeps scraper: mixed feelings about Scrubyt

Motivation

Hashtweeps is a simple site where you can search all tweets with a specific hash term. I decided to give Scrubyt a go and write a little Hashtweeps scraper to get a feel for it. The CEO of the company where I work was at the leadscon conference, so I was challenged to gather all tweets with the leadscon hash (#leadscon).


The code


require 'rubygems'
require 'scrubyt'

HASH_TERM = "leadscon"

data = Scrubyt::Extractor.define do
  fetch 'http://www.hashtweeps.com/'
  fill_textfield "term", HASH_TERM
  submit

  # build XML structure
  item "//li[@class='result']" do
    msg "//div[@class='msg']"
    link "//div[@class='info']/a[1]/@href"
    user "//div[@class='info']/a[1]", :generalize => false do
      # this will follow the link because the method name ends with "_detail"
      page_detail do
        profile_info '//ul[@class="about vcard entry-author"]' do
          full_name "//li//span[@class='fn']"
          location "//li//span[@class='adr']"
          website "//li//a[@class='url']/@href"
          bio "//li//span[@class='bio']"
        end
      end
    end
  end
end

# dump XML to file (the block form closes the file when done)
File.open("output.xml", "w") do |dump|
  dump.puts data.to_xml
end




Screenshot of my output

I added a very simple XSLT to my XML output and here it is:



Scrubyt: Pros
  • All in one: navigator, extractor, output builder. With very few lines of code, you can write a simple scraper which can navigate pages, scrape and build/output a custom XML structure.


Scrubyt: Cons
  • Lack of a good API reference. I had issues with the official one. How am I supposed to know that if you end your method name with "_detail" it will navigate to that page? It seems hard to get beyond the "scrape Google results" type of scenario. Indeed, there are lots of TODOs in the reference. Hopefully this reference will get more structure and coverage.
  • No debug info. Even though the code above is fairly simple, when it breaks, you have no idea what the error was: Scrubyt just exits.
  • Impossible to test. How do you make sure the second page you navigate still exists? How do you make sure the HTML elements are still where you thought they were? How do you make sure your code constructs the XML structure as you want it to be?
  • The dependencies are not harnessed. I am a fan of having tight control over dependencies' versions. I wasted time figuring out that Scrubyt could not run with the (latest) version of the Mechanize gem I had installed.
  • The unofficial doc is outdated. Tutorials are nice when you want to get a first feel for a new tool. Unfortunately, probably because the Scrubyt source code has been changing a lot in the last year, the Scrubyt tutorials out there are no longer accurate. Remedy: the first stop to kick-start your first scraper should be the tutorials on Scrubyt's wiki.
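
On the testability point: one way around it, independent of Scrubyt, is to keep the extraction step as a plain function over markup and run it against a saved copy of the page. This is only a sketch under my own assumptions. It uses REXML (bundled with Ruby) instead of Scrubyt, so the fixture has to be well-formed XML, and both extract_tweets and the fixture content are made up:

```ruby
require 'rexml/document'

# Pure extraction step: takes markup, returns plain Ruby data.
# The XPath expressions follow the class names used in my scraper.
def extract_tweets(xhtml)
  doc = REXML::Document.new(xhtml)
  REXML::XPath.match(doc, "//li[@class='result']").map do |item|
    {
      :msg  => REXML::XPath.first(item, ".//div[@class='msg']").text.strip,
      :link => REXML::XPath.first(item, ".//div[@class='info']/a").attributes['href']
    }
  end
end

# Made-up fixture imitating the page structure the scraper expects.
FIXTURE = <<-HTML
<ul>
  <li class="result">
    <div class="msg"> Great panel at #leadscon </div>
    <div class="info"><a href="http://twitter.com/example_user">example_user</a></div>
  </li>
</ul>
HTML

tweets = extract_tweets(FIXTURE)
raise "fixture no longer matches the expected structure" unless tweets.size == 1
puts tweets.first[:msg]
```

This doesn't tell you whether the live page still looks like the fixture, but it does pin down what the code builds from a known input.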

Conclusion

If you're serious about scraping, Scrubyt is not a viable option. As soon as you go beyond the trivial scraper (like the one I did), you're in for wasting a lot of your time.

Thursday, March 5, 2009

Ruby VM's, by Jason Seifer

I watched Jason Seifer's 30-min QCon presentation on Ruby VM's. He describes the main characteristics of most Ruby VM's out there. A little note: his criterion for deciding whether a VM is production-ready is its support (or lack of support) for Rails, since that's a Ruby framework which touches on a lot of aspects of the Ruby implementation.



Here are my highlights.


Ruby versions
  • Ruby 1.8.6 is the most widespread version
  • Ruby 1.8.7 is the current stable version of Ruby (middle ground between 1.8.6 and 1.9)
  • all VM's focus on Ruby 1.8.6 (with the exception of JRuby, which has support for both Ruby 1.8.6 and Ruby 1.9)

MRI
  • the de facto standard
  • the only concise Ruby spec out there
  • supported by all major platforms
  • production-ready

YARV
  • Ruby 1.9.1 will have YARV as the default VM
  • not production-ready

MacRuby
  • Ruby 1.9 VM on OS X core technologies
  • mostly uses Objective-C and YARV C code
  • open source
  • early in development (< 1.0 release)
  • not production-ready

XRuby
  • first Ruby to Java compiler
  • last release: 0.3 in Nov 2007
  • not production-ready

MagLev
  • not released yet
  • made by GemStone
  • uses an OODB (allows persistence of Ruby objects)
  • not production-ready

Rubinius
  • "Ruby in Ruby" (since part of it is written in Ruby)
  • the VM proper is written in C++
  • the standard lib written in Ruby
  • Ruby objects mapped to a C++ object
  • migration from C to C++ was for lowering the barrier of entry for project contribution
  • slow
  • best of the MRI contenders
  • Ruby spec project came out of Rubinius
  • not production-ready

IronRuby
  • Microsoft's implementation of Ruby
  • Ruby on .Net
  • released under the Ms-PL (Microsoft Public License)
  • runs on the DLR (Dynamic Language Runtime), on top of the CLR
  • DLR supported languages: IronPython, IronRuby, Javascript, Dynamic
  • not production-ready

JRuby
  • big advantage: can use existing Java code
  • most performant Ruby implementation
  • multi-threaded (native threads, i.e. Ruby threads are real OS threads that can run in parallel)
  • introduced a compatibility mode (can start the VM with a Ruby 1.8 or 1.9 "compatibility flag")
  • The solution for the enterprise-class Ruby applications
  • production-ready (!)
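
The native-thread point is easiest to see with plain Thread code. Here is a minimal sketch (the parallel_sum helper is made up for illustration). It returns the same result on any Ruby VM, but only an implementation whose threads are native and not serialized by a global lock, such as JRuby, can actually run the slices on several cores at once:

```ruby
# Split a range into slices and sum each slice in its own thread.
def parallel_sum(range, slices)
  numbers = range.to_a
  per_slice = [(numbers.size / slices.to_f).ceil, 1].max
  threads = numbers.each_slice(per_slice).map do |slice|
    Thread.new { slice.inject(0) { |sum, n| sum + n } }
  end
  # Thread#value waits for the thread to finish and returns its result.
  threads.map { |t| t.value }.inject(0) { |sum, n| sum + n }
end

puts parallel_sum(1..1_000, 4)
```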