japh(r) by Chris Strom: January 2009

Monday, January 26, 2009

Pumpking

The position of Pumpking is a venerable one in the Perl community. The Pumpking, speaking with as much authority as Larry himself, was responsible for major releases of Perl.

I very much appreciated the concept back in my JAPH days. Shortly after starting as head of software engineering (for a Rails shop), I introduced the role of Iteration Pumpking.

The Iteration Pumpking

As Pumpking, one developer is given responsibility for the overall quality of the iteration. The Pumpking is expected to comment on any suspect commits (e.g. violations of coding standards, poor coding practices, missing tests). The Pumpking has complete authority to re-open tickets as they see fit. The Pumpking calls for code review where appropriate.

The most important result of the Pumpking role is that overall code quality has improved simply because everyone knows that all commits are being reviewed. It is human nature to try a little harder when you know that others will be reviewing your work.

Another important aspect of the role is that any developer can (and is expected to) serve as Pumpking. Each Pumpking brings a different point of view. A change in perspective always helps the code and the individuals being reviewed.

Serving as Pumpking is especially beneficial in improving the skills of weaker coders. An iteration might suffer for want of keen reviews under a weak Pumpking, but not nearly as much as it would from having an NNPP (net negative producing programmer) actively committing code changes. Weak Pumpkings are forced to step back from their normal practices and examine, with a critical eye, the work of others.

The Pumpking role evolved rather nicely. As is always best when trying something like this, the initial scope of the role was left vague. With each subsequent Pumpking, the role grew more defined. At some point, someone even developed tools to facilitate the role. The team, one at a time, grew the role to best meet its own needs.

I do wonder about trying this with teams of weaker developers. To be sure, there were some weak team members in my group, but the team as a whole skewed toward the proficient portion of the expertise spectrum. Something to consider in the future.

Saturday, January 24, 2009

Render, Dammit

I choose to believe that this little anti-pattern is not unique to my company...

Please Do Not Do This At Home or ANYWHERE ELSE!

Consider, if you will, a form. You may imagine a simple little form. We tend to prefer big, honking ones in our ENTERPRISE CLASS APPLICATION. Any old form will do for purposes of this discussion.

The form submits to a controller action (e.g. create or update). Should the action recognize an error, we note the error in the session flash, then redirect back to the submitting form.

Not only do we do this (a lot), but we have built up several conventions around it. We have a :messages_top and :messages_bottom that put the warning at the top or bottom of the form. We even have many, many ways to store the submitted params in the session so that the form can show the submitted values when displaying the error message.

We have spent so much time doing this that it actually works. Well, most of the time. Except when the user clicks on an FAQ link, then the back button and all the data (in the session) is lost. Or when the Rails full error messages do not read quite right (why didn't the Rails people make messages read better?).

The correct way of handling this, of course, is to have the action not redirect when an error occurs or a validation fails. Instead, the action should re-render the submitting form template. The action already has the error messages, the submitted form parameters and can easily build up any lists that are needed in the form.
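As a rough illustration of the difference, here is a minimal sketch in plain Ruby so that it runs without Rails. The Post struct, the controller class and its render / redirect_to stand-ins are all hypothetical; in a real Rails action the same shape would use the framework's own render and redirect_to. The point is only that the failure branch stays in the same request, where the submitted values and the errors are already at hand.

```ruby
# Plain-Ruby stand-ins (hypothetical names, not the Rails API).
Post = Struct.new(:title) do
  # A single made-up validation: the title must not be blank.
  def errors
    title.to_s.strip.empty? ? ["Title can't be blank"] : []
  end

  def save
    errors.empty?
  end
end

class PostsController
  attr_reader :response

  def initialize(params)
    @params = params
  end

  def create
    @post = Post.new(@params[:title])
    if @post.save
      # Success: redirect so that a browser reload does not resubmit the form.
      redirect_to "/posts"
    else
      # Failure: re-render the form in the same request. Nothing is
      # stashed in the session; the object and its errors are right here.
      render :new
    end
  end

  private

  def redirect_to(path)
    @response = [:redirect, path]
  end

  def render(template)
    @response = [:render, template, {:post => @post, :errors => @post.errors}]
  end
end

controller = PostsController.new(:title => "")
controller.create
p controller.response.first
```

Because the failure branch never leaves the request, there is no session bookkeeping to lose when the user wanders off to the FAQ and back.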

How on Earth did this Happen?

Quite simply, this is the work of one or more NNPPs.

One instance of this can be forgiven. If you are new to the framework and web development and you are under the pressure of a deadline, it happens. But to not go back and question if there is a better way, to never come across a better way in reading a book or even the scaffolding, is unacceptable.

Ultimately, the team needs to tell the developer, "please stop, you're hurting us". If the NNPP fails to grow into an NPPP, the developer needs to go.

Thursday, January 22, 2009

Why Twitter Matters

I recently got religion, so take this with a grain of salt, but...

Twitter matters.

It is part of my exocortex. It is one of the tools that I use to process information outside of my brain.

Thanks in part to my newly found religion, Twitter is not the only extension of my neocortex that I use. I use Emacs org-mode. I have a Moleskine. I store things in delicious. So no, Twitter is not the entirety of my exocortex, but it is the most important part...

What makes it so important is that it is the part of my exocortex that I share with others. I influence others. Others influence me. All in 140 characters. No wasting anyone's time (much). Little risk of my time being wasted.

It's that interaction that makes it so important. Because every now and then, you get something that inspires you to achieve. Or you inspire someone else. Or maybe, just maybe, you get this.

Wednesday, January 21, 2009

Announce: Conventionally Scoped Names Gem

I've posted a conventionally_scoped_names gem to github.

More info:

Named Scopes in Views (Part 1)

Named Scopes in Views (Part 2)

Why?

My main motivation for releasing the code in gem format was for the experience (never having done it before). And who knows? It might prove useful to someone.

Two things I learned:

it was not quite as easy as I had expected

artificial constraints really slow things down

Point #1 is an important one. I had expected obstacles, just not some of the obstacles that I met. Honestly, that's not 100% accurate. I expected obstacles, but envisioned being able to blow through them with ease. I made it through, but not with the ease I would have liked.

As for point #2, I can sum it up like this: sometimes I'm obstinate to the point of dumb-assitude.

I normally use RSpec for testing and follow TATFT. For this gem, I decided that I needed to follow the Rails testing convention more closely. At the same time (and this is where the obstinate bit comes in) I refuse to use Shoulda, Context, and Mocha, because, really, what do they give you that RSpec does not?

Thus, my artificial constraints were that I wanted to write tests for a controller/model plugin without being able to mock easily. That meant fixtures, which are not my bag, baby.

My biggest problems were with the fixture datetimes. For whatever reason, I could not get date comparisons to work properly with the in-memory sqlite3 DB. Eventually, I opted to use string dates instead of datetimes (it did not affect what was being tested one way or the other), but I spent a lot of time trying to get it working.

I could draw the lesson from this experience that if I do not impose artificial constraints, I will get done much faster. But you know what? There are always constraints—artificial or otherwise. To be a better coder, to improve at my craft, I need to get better at meeting and overcoming constraints.

So what I choose to take from this experience is that I need to keep up the reps.

Monday, January 12, 2009

Named Scopes in Views (Part 2)

In part 1, we explored the use of named scopes in views (more precisely, named scope names). The named scopes that we used were relatively simple in that they took no arguments.

But what if we have a requirement to filter blog posts by substrings in the title? The named scope would be something of the form:

named_scope :with_title, lambda { |*args| {:conditions => ["title like ?", '%'+args[0]+'%']} }

This makes our entire post class look like this:

class Post < ActiveRecord::Base
  named_scope :published, :conditions => {:status => 'published'}
  named_scope :draft,     :conditions => {:status => 'draft'}
  named_scope :archived,  :conditions => {:status => 'archived'}

  named_scope :published_or_draft, :conditions => ["status in ('published', 'draft')"]

  named_scope :last_month, lambda { {:conditions => ["updated_at > ?", 1.month.ago]} }
  named_scope :last_year,  lambda { {:conditions => ["updated_at > ?", 1.year.ago]} }
  named_scope :this_month, lambda { {:conditions => ["updated_at > ?", Time.now.beginning_of_month]}}

  named_scope :sort_by_title,   :order => "title DESC"
  named_scope :sort_by_updated, :order => "updated_at DESC"

  named_scope :with_title, lambda { |*args| {:conditions => ["title like ?", '%'+args[0]+'%']} }

  def published?; status == "published" end
  def draft?;     status == "draft"     end
  def archived?;  status == "archived"  end
end

Recall from part 1 that we passed all our scopes into the controller in the form:

{:scopes => ["published", "sort_by_title"]}

Ideally, I would like to introduce parameterized named scopes to the controller in the form:

{:scopes => [ ["published"],
              ["sort_by_title"],
              ["with_title", "foo"] ]}

If such a data structure were easily built in forms, then we could almost use the controller code from part 1 without change. Sadly, that is not quite possible.

As a workaround, I choose the convention of passing in named scopes with parameters as hashes.

<div id="filter-by-title-substring">
 <input type="text" name="scopes[][with_title]"/>
</div>

This will give us parameters of the form:

{:scopes => [ "published",
              "sort_by_title",
              {"with_title" => "foo"} ]}

Since we are already adding complexity, why not generalize a bit as well? The following code will handle the list-of-strings-and-hashes data structure, plus it will work for any class:

class PostsController < ApplicationController
  def index
    @posts = scopes_for(Post)
  end

  private
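  def scopes_for(klass)
    scopes_from_params(klass).inject(klass){|proxy, scope| proxy.send(*scope)}
  end

  # klass is passed in so that the supplied names can be checked
  # against the scopes that the class actually declares
  def scopes_from_params(klass)
    returning scopes = [] do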
      (params[:scopes] || []).reject(&:blank?).each do |scope|
        case scope.class.to_s
        when "String"
          scopes << [scope]    if klass.scopes.include?(scope.to_sym)
        when "Hash", "HashWithIndifferentAccess"
          msg, arg = scope.first # {:foo => 'bar'}.first => [:foo, "bar"]
          scopes << [msg, arg] if klass.scopes.include?(msg.to_sym) && !arg.blank?
        end
      end
    end

    scopes.blank? ? [:all] : scopes
  end
end

The scopes_for() private method handles the proxy scope injection that was done entirely in the index action from part 1.

The actual scope calculation is now done in the scopes_from_params method. It still checks the supplied scopes to ensure that they are scopes and not arbitrary methods. It still returns the :all scope if no scopes are supplied. But it also handles our "Hash" case for invoking named scopes with arguments.

The last change is back in the scopes_for method, which now splats the scopes returned from scopes_from_params. If we access the action with the parameters:

{:scopes => [ "published",
              "sort_by_title",
              {"with_title" => "foo"} ]}

We will get the following back from scopes_from_params():

[ ["published"],
  ["sort_by_title"],
  ["with_title", "foo"] ]

Injecting these into Post in scopes_for() will result in Post.send("published").send("sort_by_title").send("with_title", "foo"). This is equivalent to Post.published.sort_by_title.with_title("foo") and is exactly what we want.
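To see the normalize-then-splat step in isolation, here is a small sketch in plain Ruby. FakeScope, normalize and apply_scopes are hypothetical names invented for the illustration (this is not ActiveRecord), and the whitelist check from the controller is deliberately left out to keep the example short.

```ruby
# Toy chainable object standing in for Post (hypothetical, not ActiveRecord).
# Each call returns a new object that remembers the messages sent so far.
class FakeScope
  attr_reader :calls

  def initialize(calls = [])
    @calls = calls
  end

  def published
    FakeScope.new(calls + [[:published]])
  end

  def with_title(substring)
    FakeScope.new(calls + [[:with_title, substring]])
  end
end

# Turn the mixed list of strings and one-pair hashes into
# [message, *args] arrays. Hash#first returns the [key, value] pair.
def normalize(raw_scopes)
  raw_scopes.map do |scope|
    scope.is_a?(Hash) ? scope.first : [scope]
  end
end

# The splat lets one-element and two-element arrays share the same send call.
def apply_scopes(base, raw_scopes)
  normalize(raw_scopes).inject(base) { |proxy, scope| proxy.send(*scope) }
end

result = apply_scopes(FakeScope.new, ["published", {"with_title" => "foo"}])
p result.calls
```

The splat is what keeps the inject block identical for scopes with and without arguments.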

We now have all the benefits from part 1 (simple views, simple controllers, no magic number coupling between view & controller to handle combo cases like showing published and drafted posts at the same time), plus the ability to use named scopes that take arguments.

Saturday, January 10, 2009

Named Scopes in Views (Part 1)

Named scopes are awesome. What makes them so awesome is that named scopes are named. They have a label that means something in the current domain.

Granted, the Model is the Model and the View is the View, and never the twain shall meet. Models should never intrude into the View and this includes named scopes. But...

This does not mean that scope names should stay out of views—they have just as much meaning in the view as they do in the model. If we can exploit the convention of using the same names, so much the better.

Consider a post class with some sweet named scopes:

class Post < ActiveRecord::Base
  named_scope :published, :conditions => {:status => 'published'}
  named_scope :draft,     :conditions => {:status => 'draft'}
  named_scope :archived,  :conditions => {:status => 'archived'}

  named_scope :published_or_draft, :conditions => ["status in ('published', 'draft')"]

  named_scope :last_month, lambda { {:conditions => ["updated_at > ?", 1.month.ago]} }
  named_scope :last_year,  lambda { {:conditions => ["updated_at > ?", 1.year.ago]} }
  named_scope :this_month, lambda { {:conditions => ["updated_at > ?", Time.now.beginning_of_month]}}

  named_scope :sort_by_title,   :order => "title DESC"
  named_scope :sort_by_updated, :order => "updated_at DESC"

  def published?; status == "published" end
  def draft?;     status == "draft"     end
  def archived?;  status == "archived"  end
end

Like I said, sweet. You can ask for all posts that are published (Post.published), drafted (Post.draft), or even published or drafted together (Post.published_or_draft). You can sort published posts by title (Post.published.sort_by_title). You can even ask for all published posts from this month, sorted by title (Post.published.this_month.sort_by_title).

Ah, the power of named scopes.

It is easy to see how these would come in handy when listing posts (as in an index page). Once we have lots of posts, it will be quite convenient to filter & sort the list to find exact posts. Consider the following generated view code:

<form action="/posts" method="get">
<div id="filter-by-status">
 <select name="scopes[]">
   <option value="published_or_draft">Published or Draft</option>
   <option value="published">Published</option>
 </select>
</div>

<div id="filter-by-date">
 <select name="scopes[]">
   <option value="this_month">This Month</option>
   <option value="last_month">Last Month</option>
   <option value="last_year">Last Year</option>
 </select>
</div>

<div id="sort">
 <select name="scopes[]">
   <option value="sort_by_title">Title</option>
   <option value="sort_by_updated">Update Date</option>
 </select>
</div>
</form>

Nice, clean HTML with meaningful names. No magic numbers to represent showing published and draft posts at the same time. No need to worry about too many text field names—everything is in scopes[].

Even in the face of all this named scope goodness, the controller remains simple:

class PostsController < ApplicationController
  def index
    scopes = (params[:scopes] || []).reject(&:blank?)
    scopes = [:all] if scopes.blank?
    @posts = scopes.inject(Post){|proxy, scope| proxy.send(scope)}
  end
end

Simple, but perhaps warranting some explanation. First, we reject any blank scopes. If no scopes remain, then we are left with the default of :all posts.

The inject statement exploits the chaining nature of named scopes. To see this, consider the case in which scopes == [:published, :sort_by_title]. The first iteration through the inject would set the named scope proxy as the Post class itself. By sending it :published, we return a (published) named scope for assignment in the next iteration. In that next iteration, the :sort_by_title named scope is sent, returning a named scope chain equivalent to:

Post.published.sort_by_title
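The chaining can be tried outside of Rails with a toy object. The following is only a sketch: ScopeProxy and chain_scopes are hypothetical names, and the proxy merely records which scopes were requested rather than building a query.

```ruby
# Toy stand-in for a named scope proxy (hypothetical, not ActiveRecord):
# every scope call returns a new proxy that remembers the chain so far.
class ScopeProxy
  attr_reader :chain

  def initialize(chain = [])
    @chain = chain
  end

  def published
    ScopeProxy.new(chain + [:published])
  end

  def sort_by_title
    ScopeProxy.new(chain + [:sort_by_title])
  end
end

# Start from the base object and send each scope name to the result
# of the previous call, the same shape as the inject in the index action.
def chain_scopes(base, scopes)
  scopes.inject(base) { |proxy, scope| proxy.send(scope) }
end

result = chain_scopes(ScopeProxy.new, [:published, :sort_by_title])
p result.chain
```

The accumulator in inject plays the role of the proxy: it starts as the base object and is replaced by whatever each send returns.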

An astute reader might note that we risk an attacker sending arbitrary messages to our base class. Make no mistake, this is a grave risk. Fortunately, it is easy enough to prevent by selecting only the scopes that are known to the Post class:

class PostsController < ApplicationController
  def index
    # After rejecting blank scopes, map the remaining to symbols
    # so that they can be selected from the list of known Post scopes
    scopes = (params[:scopes] || []).reject(&:blank?).map(&:to_sym).select{|s| Post.scopes.include?(s)}
    scopes = [:all] if scopes.blank?
    @posts = scopes.inject(Post){|proxy, scope| proxy.send(scope)}
  end
end

The only scopes allowed are those known by the Post class as a scope (and not a method).
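The filtering step on its own looks something like the sketch below, written in plain Ruby. KNOWN_SCOPES is a hypothetical constant standing in for the list that Post.scopes supplies in the controller, and safe_scopes is an invented helper name.

```ruby
# Hypothetical list of scope names the model declares; in the controller
# above this role is played by Post.scopes.
KNOWN_SCOPES = [:published, :draft, :archived, :sort_by_title].freeze

# Keep only the non-blank names that match a declared scope,
# falling back to :all when nothing usable remains.
def safe_scopes(raw)
  scopes = Array(raw).
    reject { |s| s.to_s.strip.empty? }.
    map    { |s| s.to_sym }.
    select { |s| KNOWN_SCOPES.include?(s) }
  scopes.empty? ? [:all] : scopes
end

p safe_scopes(["published", "", "destroy_all"])
p safe_scopes(nil)
```

Anything not on the list, such as destroy_all here, is silently dropped before it can ever reach send.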

Scope injection attack thwarted!

We have added a lot of power to our posts controller's index action. Even with that power, we still have a very simple controller, a view with one parameter namespace for all sorting and filtering options that we like, and no magic number coupling between the view and controller. Best of all, we have a bunch of re-usable named scopes in the model.

In part 2, we will add a little bit of complexity to handle named scopes that take arguments.

Thursday, January 8, 2009

GNU Screen

I've gotten so set in certain ways that my productivity takes a severe hit if tiny things are varied. One of those things is how I use GNU screen.

My windows:

Source code control and ancillary daemon manipulation (e.g. memcached).

Code interaction - mostly ack (--thppt); some configuration edits; almost no code edits (that's why God made emacs).

REPL / script/console - for whatever reason, I find this easier to do in the console than in Emacs

Interactive DB session

Logging - usually a tail -f on a log file or two

I am not sure how I settled on this particular configuration, but, as I said, I almost can't function without it.

Thursday, January 1, 2009

Clojure http-client.clj

I've been playing with Clojure recently. I've been following along in the Beta version of Programming Clojure, which uses r1097 of clojure for all examples. Being too much of a newbie, I did not want to stray.

The problem that I ran into was that the http-client.clj posted to the clojure group would not compile. As best I can tell, it seems that the clojure.core namespace used to be simply clojure. Search and replace quickly solved that.

The syntax for doseq also looks to have been updated -- square brackets are required, so:

        (doseq kw (keys (http-defaults :set-header))
           (. conn (setRequestProperty kw (get (http-defaults :set-header) kw))))

becomes:

        (doseq [kw (keys (http-defaults :set-header))]
           (. conn (setRequestProperty kw (get (http-defaults :set-header) kw))))

My working version:

;;  Copyright (c) Dirk Vleugels. All rights reserved.
;;  The use and distribution terms for this software are covered by the
;;  Common Public License 1.0 (http://opensource.org/licenses/cpl.php)
;;  which can be found in the file CPL.TXT at the root of this distribution.
;;  By using this software in any fashion, you are agreeing to be bound by
;;  the terms of this license.
;;  You must not remove this notice, or any other, from this software.
;;
;;  http-client.clj 0.1
;;
;;  Usage:
;;
;;  (http-client/url-do "https://gmail.com" "GET" { :read-timeout 5000
;;                                                  :set-header { "User-Agent" "Clojure-HttpClient/0.1"
;;                                                                "X-Header-xxx" "This is a test" }
;;                                                  :connect-timeout 1000
;;                                                  :http-proxy "http://proxy.org:8080"
;;                                                  :follow-redirects false
;;                                                  :use-caches false })
;;  (http-client/url-do "https://gmail.com" "POST" {:read-timeout 5000
;;                                                  :connect-timeout 1000
;;                                                  :use-caches false }
;;                                                  "this is a post test")
;;
;; Returns a map or (currently) throws an exception
;;
;; {:return-message "Found", :return-code 302, :header {nil [HTTP/1.1 302 Found], "Server" [GFE/1.3], "Content-Type" [text/html; charset=UTF-8], "Date" [Thu, 10 Jul 2008 19:31:52 GMT], "Location" [http://www.google.com], "Content-Length" [218]}, :body [B@bed1e7}
;;
;;  A clojure interface for the jdk http client (requires jdk 1.5)
;;  - ignores SSL cert or host verification problems (thats what i want 99% of the time)
;;
;; Todo:
;;  - exception handling + resource cleanup, logging?
;;
;;  dirk.vleugels (gmail)
;;  8 Juli 2008

;;(load-file "/Users/dvl/work/clojure/svn/clojure/trunk/src/genclass.clj")
(clojure.core/in-ns 'http-client)
(clojure.core/refer 'clojure.core)


(import '(java.net URL URLConnection InetSocketAddress Proxy Proxy$Type)
        '(java.io Reader InputStream InputStreamReader FileReader
                  BufferedReader File PrintWriter OutputStream ByteArrayOutputStream
                  OutputStreamWriter BufferedWriter Writer FileWriter)
        '(javax.net.ssl X509TrustManager HostnameVerifier SSLContext HttpsURLConnection)
        '(java.security.cert X509Certificate))

(def http-defaults {:read-timeout 10000
                    :connect-timeout 3000
                    :http-proxy ""
                    :set-header { "User-Agent" "Clojure-HttpClient/0.1" }
                    :use-caches false
                    :follow-redirects false})

;; provide dummy trust manager
(let [ clazz "httpclient.CljTrustManager" ]
  (try (. Class (forName (str clazz)))
       (catch java.lang.ClassNotFoundException e
         (gen-and-load-class (str clazz) :implements [javax.net.ssl.X509TrustManager]))))
(clojure.core/in-ns 'httpclient.CljTrustManager)
(clojure.core/refer 'clojure.core)
(defn checkClientTrusted [this chain auth-type])
(defn checkServerTrusted [this chain auth-type])
(defn getAcceptedIssuers [this ] nil)
(clojure.core/in-ns 'http-client)
;; end

;; provide hostname verifier
(let [ clazz "httpclient.CljHostnameVerifier" ]
  (try (. Class (forName (str clazz)))
       (catch java.lang.ClassNotFoundException e
         (gen-and-load-class (str clazz) :implements [javax.net.ssl.HostnameVerifier]))))
(clojure.core/in-ns 'httpclient.CljHostnameVerifier)
(clojure.core/refer 'clojure.core)
(defn verify [this hostname ssl-session] true)
(clojure.core/in-ns 'http-client)
;; end

(defn- create-proxy [proxy]
  (if (= proxy "")
    (. Proxy NO_PROXY)
    (let [u (new URL proxy)
          host (. u getHost)
          port (if (neg? (. u getPort)) 8080 (. u getPort))
          sa (new InetSocketAddress host port)
          prx (new Proxy (. Proxy$Type HTTP) sa)]
      prx)))

(defn url-do
  "Connect to given url, execute given HTTP method. defaults may be empty {}, body is only
   used for POST requests"
  [url method defaults & body]
  (binding [ http-defaults (merge http-defaults defaults) ]
    (let [p (create-proxy (http-defaults :http-proxy))
          u (new URL url)]
      (when (. (. url toLowerCase) startsWith "https")
        (let [sc (. SSLContext getInstance "SSLv3")
              tma (new httpclient.CljTrustManager)
              ar (make-array httpclient.CljTrustManager 1)]
          (aset ar 0 tma)
          (. sc (init nil ar nil))
          (. HttpsURLConnection setDefaultSSLSocketFactory (. sc getSocketFactory))
          (. HttpsURLConnection setDefaultHostnameVerifier (new httpclient.CljHostnameVerifier))))

      (let [ conn (. u openConnection p) ]
        (doto conn
          (setRequestMethod method)
          (setUseCaches (http-defaults :use-caches))
          (setReadTimeout (http-defaults :read-timeout))
          (setInstanceFollowRedirects (http-defaults :follow-redirects))
          (setConnectTimeout (http-defaults :connect-timeout)))
        (doseq [kw (keys (http-defaults :set-header))]
          (. conn (setRequestProperty kw (get (http-defaults :set-header) kw))))
        (when (= method "POST")
          (when (nil? body) (throw (new Exception "need body for post request")))
          (do
            (. conn setDoOutput true)
            (doto (new OutputStreamWriter (. conn getOutputStream ))
                  (write (nth body 0) 0 (count (nth body 0)))   (flush) (close))))
        (let [rcode (. conn getResponseCode)
              rmsg (. conn getResponseMessage)
              is (. conn getInputStream)
              buf-len 8192
              buf (make-array (. Byte TYPE) buf-len)
              #^ByteArrayOutputStream bos (new ByteArrayOutputStream)
              headers (into {} (. conn getHeaderFields))]
          (loop []
            (let [ nread (. is (read buf 0 buf-len)) ]
              (when (> nread -1)
                (. bos (write buf 0 nread))
                (recur))))
          (let [ bar (. bos toByteArray) ]
            (. is close)
            (. bos close)
            (. conn disconnect)
            {:return-code rcode :return-message rmsg :body bar :header headers}))))))

(comment
(pr (new String ((url-do "https://google.com" "GET" { :read-timeout 5000  }) :body) "UTF-8"))
(pr (url-do "http://google.com/" "GET" { :read-timeout 5000 :connect-timeout 1000 :http-proxy "http://127.0.0.1:3128/" }))
(pr (url-do "https://google.com/" "GET" { :read-timeout 5000 :connect-timeout 1000 :http-proxy "http://127.0.0.1:3128/" }))
)

I should probably submit this back to the group, but again, newb (and running against r1097).