Wednesday, September 30, 2009

Outside-In Ingredients


Running the Cucumber scenario from yesterday, I find that all of the steps are currently undefined:
jaynestown% cucumber features/ingredient_index.feature:7 -s
Sinatra::Test is deprecated; use Rack::Test instead.
Feature: Ingredient index for recipes

As a user curious about ingredients or recipes
I want to see a list of ingredients
So that I can see a sample of recipes in the cookbook using a particular ingredient

Scenario: A couple of recipes sharing an ingredient
Given a "Cookie" recipe with "butter" and "chocolate chips"
And a "Pancake" recipe with "flour" and "chocolate chips"
When I visit the ingredients page
Then I should see the "chocolate chips" ingredient
And "chocolate chips" recipes should include "Cookie" and "Pancake"
And I should see the "flour" ingredient
And "flour" recipes should include only "Pancake"

1 scenario (1 undefined)
7 steps (7 undefined)
0m0.540s

You can implement step definitions for undefined steps with these snippets:

Given /^a "([^\"]*)" recipe with "([^\"]*)" and "([^\"]*)"$/ do |arg1, arg2, arg3|
pending
end

When /^I visit the ingredients page$/ do
pending
end

...
The first two steps in that scenario can be defined with:
Given /^a "([^\"]*)" recipe with "([^\"]*)" and "([^\"]*)"$/ do |title, ing1, ing2|
  date = Date.new(2009, 9, 30)
  permalink = date.to_s + "-" + title.downcase.gsub(/\W/, '-')

  recipe = {
    :title        => title,
    :type         => 'Recipe',
    :published    => true,
    :date         => date,
    :preparations => [{'ingredient' => {'name' => ing1}},
                      {'ingredient' => {'name' => ing2}}]
  }

  RestClient.put "#{@@db}/#{permalink}",
    recipe.to_json,
    :content_type => 'application/json'
end
Nothing too difficult in there: build a hash describing the recipe and then put it into the CouchDB database.
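To see what that step actually sends, here is a minimal sketch that builds the same permalink and JSON body without talking to CouchDB. The recipe_document helper is my own name for illustration; it is not part of the step definitions.

```ruby
require 'date'
require 'json'

# Hypothetical helper (not in the step definitions): returns the permalink
# and the JSON body that the step would PUT to CouchDB.
def recipe_document(title, ing1, ing2, date = Date.new(2009, 9, 30))
  permalink = date.to_s + "-" + title.downcase.gsub(/\W/, '-')

  recipe = {
    :title        => title,
    :type         => 'Recipe',
    :published    => true,
    :date         => date,
    :preparations => [{'ingredient' => {'name' => ing1}},
                      {'ingredient' => {'name' => ing2}}]
  }

  [permalink, recipe.to_json]
end

permalink, body = recipe_document("Cookie", "butter", "chocolate chips")
puts permalink
puts body
```

The permalink doubles as the CouchDB document ID, which is why the date prefix matters: it keeps IDs in the same format as the real recipe documents.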

Next, I define the step to visit the ingredient index page that I created yesterday:
When /^I visit the ingredients page$/ do
visit "/ingredients"
end
Easy enough. Except:
jaynestown% cucumber features/ingredient_index.feature
Sinatra::Test is deprecated; use Rack::Test instead.
Feature: Ingredient index for recipes

As a user curious about ingredients or recipes
I want to see a list of ingredients
So that I can see a sample of recipes in the cookbook using a particular ingredient

Scenario: A couple of recipes sharing an ingredient # features/ingredient_index.feature:7
Given a "Cookie" recipe with "butter" and "chocolate chips" # features/step_definitions/ingredient_index.rb:1
And a "Pancake" recipe with "flour" and "chocolate chips" # features/step_definitions/ingredient_index.rb:1
When I visit the ingredients page # features/step_definitions/ingredient_index.rb:22
Resource not found (RestClient::ResourceNotFound)
/usr/lib/ruby/1.8/net/http.rb:543:in `start'
./features/support/../../eee.rb:227:in `GET /ingredients'
(eval):7:in `get'
features/ingredient_index.feature:11:in `When I visit the ingredients page'
Then I should see the "chocolate chips" ingredient # features/ingredient_index.feature:12
And "chocolate chips" recipes should include "Cookie" and "Pancake" # features/ingredient_index.feature:13
And I should see the "flour" ingredient # features/ingredient_index.feature:14
And "flour" recipes should include only "Pancake" # features/ingredient_index.feature:15
Ah, the RestClient::ResourceNotFound exception is being raised because I have not yet added the CouchDB map/reduce view required by the Sinatra action:
get '/ingredients' do
url = "#{@@db}/_design/recipes/_view/by_ingredients?group=true"
data = RestClient.get url
@ingredients = JSON.parse(data)['rows']

"<title>EEE Cooks: Ingredient Index</title>"
end
I am using the couch_docs gem to load design documents into the CouchDB test database, so all I need to do is create couch/_design/recipes/views/by_ingredients/map.js:
function (doc) {
  if (doc['published']) {
    for (var i in doc['preparations']) {
      var ingredient = doc['preparations'][i]['ingredient']['name'];
      var value = [doc['_id'], doc['title']];
      emit(ingredient, {"id":doc['_id'],"title":doc['title']});
    }
  }
}
And couch/_design/recipes/views/by_ingredients/reduce.js:
function(keys, values, rereduce) {
  if (rereduce) {
    var ret = [];
    for (var i=0; i<values.length; i++) {
      ret = ret.concat(values[i]);
    }
    return ret;
  }
  else {
    return values;
  }
}
With that, I have the first three steps in this scenario passing. That means it is time to work my way into the code to start implementing the next step. The Sinatra action is already done (yesterday), so it is time to start on the Haml template:
describe "ingredients.haml" do
#...
end
From my work the other night, I know that the CouchDB view will return an ingredient list of the form:
  before(:each) do
    @ingredients = [{'butter' =>
                      [
                        ['recipe-id-1', 'title 1'],
                        ['recipe-id-2', 'title 2']
                      ]
                    },
                    {'sugar' =>
                      [
                        ['recipe-id-2', 'title 2']
                      ]
                    }]
  end
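CouchDB view responses wrap each grouped result in a row with "key" and "value" fields. A minimal sketch, assuming that layout, of one way to reshape the rows into the list of single-entry hashes used in this spec (the sample response and variable names are mine, not taken from the application):

```ruby
require 'json'

# Sample view response, assuming the usual {"rows": [{"key": ..., "value": ...}]} layout
data = <<-EOS
{"rows": [
  {"key": "butter", "value": [["recipe-id-1", "title 1"], ["recipe-id-2", "title 2"]]},
  {"key": "sugar",  "value": [["recipe-id-2", "title 2"]]}
]}
EOS

rows = JSON.parse(data)['rows']

# Turn each row into a one-entry hash: ingredient name => list of [id, title]
ingredients = rows.map { |row| { row['key'] => row['value'] } }

puts ingredients.inspect
```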
Given that, I expect to find a list of two ingredients:
  it "should have a list of ingredients" do
    render("/views/ingredients.haml")
    response.should have_selector("p .ingredient", :count => 2)
  end
I can get that to pass with:
- @ingredients.each do |ingredient|
  %p
    %span.ingredient
      = ingredient.keys.first
Last up tonight, I would like to include the recipe titles in the Haml output:
  it "should have a list of recipes using the ingredients" do
    render("/views/ingredients.haml")
    response.should have_selector("p", :content => 'title 1, title 2')
  end
I can get that passing with:
- @ingredients.each do |ingredient|
  %p
    %span.ingredient
      = ingredient.keys.first
    %span.recipes
      = ingredient.values.first.map{|recipe| recipe[1]}.join(", ")
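The expressions in that template can be tried outside of Haml. A small sketch in plain Ruby, using the same fixture data as the spec, of what each paragraph should end up containing (the "name: titles" string format is only for illustration):

```ruby
ingredients = [
  {'butter' => [['recipe-id-1', 'title 1'], ['recipe-id-2', 'title 2']]},
  {'sugar'  => [['recipe-id-2', 'title 2']]}
]

lines = ingredients.map do |ingredient|
  name   = ingredient.keys.first                      # the ingredient name
  titles = ingredient.values.first.
    map { |recipe| recipe[1] }.                       # second element is the title
    join(", ")
  "#{name}: #{titles}"
end

puts lines
```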
At this point, I am well on my way to completing the template. Hopefully I can finish it off tomorrow.

Tuesday, September 29, 2009

The Ingredient Index Feature


I make a small offering to the gods of my chain tonight with a Cucumber feature describing the ingredient index feature:
Feature: Ingredient index for recipes

As a user curious about ingredients or recipes
I want to see a list of ingredients
So that I can see a sample of recipes in the cookbook using a particular ingredient

Scenario: A couple of recipes sharing an ingredient

Given a "Cookie" recipe with "butter and chocolate chips"
And a "Pancake" recipe with "flour and chocolate chips"
When I visit the ingredients page
Then I should see the "chocolate chips" ingredient
And "chocolate chips" recipes should include "Cookie" and "Pancake"
And I should see the "flour" ingredient
And "flour" recipes should include only "Pancake"

Scenario: Scores of recipes sharing an ingredient

Given 120 recipes with "butter"
When I visit the ingredients page
Then I should not see the "butter" ingredient
I mostly need these scenarios so that I can verify the full stack of Sinatra and CouchDB is working after the detailed implementation is driven by unit examples.

Additionally, I would like to exclude ingredients that appear in scores of recipes (the second scenario). In the past two legacy incarnations of the site, we excluded them simply because an ingredient with 100+ recipes tends to throw off the page layout. If need be, a separate page can be established to hold those recipes. This may also help me avoid potential performance issues in the CouchDB map-reduce as pointed out in the comments yesterday.
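A rough sketch of that exclusion, assuming each ingredient arrives as a one-entry hash of name to recipe list. Both the MAX_RECIPES cutoff and the displayable helper are hypothetical; nothing like them exists in the application yet.

```ruby
# Hypothetical cutoff: the legacy sites dropped ingredients with 100+ recipes
MAX_RECIPES = 100

# Hypothetical helper: keep only ingredients with fewer than `max` recipes
def displayable(ingredients, max = MAX_RECIPES)
  ingredients.reject { |ingredient| ingredient.values.first.size >= max }
end

butter = {'butter' => (1..120).map { |i| ["recipe-#{i}", "title #{i}"] }}
flour  = {'flour'  => [['recipe-1', 'title 1']]}

shown = displayable([butter, flour])
puts shown.map { |ingredient| ingredient.keys.first }.inspect
```

Filtering in Ruby like this still makes CouchDB build the full list for "butter", so it only addresses the layout concern, not the map-reduce performance one.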

Although I lack time today, I can at least drive the implementation of the Sinatra action with these examples:
  it "should respond OK" do
get "/ingredients"
last_response.should be_ok
end

it "should ask CouchDB for a list of ingredients" do
RestClient.
should_receive(:get).
with(%r{by_ingredients}).
and_return('{"rows": [] }')

get "/ingredients"
end

it "should be named \"Ingredient Index\"" do
get "/ingredients"
last_response.
should have_selector("title", :content => "EEE Cooks: Ingredient Index")
end
The simplest thing that can possibly work to ensure that the action responds OK, makes a RestClient call to CouchDB, and has a title tag is:
get '/ingredients' do
url = "#{@@db}/_design/recipes/_view/by_ingredients?group=true"
data = RestClient.get url
@ingredients = JSON.parse(data)['rows']

"<title>EEE Cooks: Ingredient Index</title>"
end
That's a fine stopping point for tonight. I will pick up tomorrow with the Haml template.

Monday, September 28, 2009

Prototyping CouchDB Views (Take 2)


I have CouchDB recipe documents that look like:
{
"_id": "2008-07-10-choccupcake",
"_rev": "1-1305119212",
"prep_time": 10,
"title": "Mini Chocolate Cupcakes with White Chocolate Chips",
"published": true,
//...

"date": "2008-07-10",
"type": "Recipe",
//...
"preparations": [
{
"brand": "",
"quantity": 1,
"unit": "cup",
"order_number": 1,
"description": "",
"ingredient": {
"name": "flour",
"kind": "all-purpose"
}
},
{
"brand": "",
"quantity": 0.5,
"unit": "cup",
"order_number": 2,
"description": "",
"ingredient": {
"name": "sugar",
"kind": "white, granulated"
}
},
//...
],
//...
}
With hundreds of recipes of this form, I would like a CouchDB view to give me a list of recipes by ingredient. Ultimately, I would like to generate a page like this (from the legacy Rails app):



I spiked this approach a while back. Looking back on that spike, I realize that I missed a couple of things: (1) the map function includes recipes that have not been published and (2) the reduce function does not deal with re-reduce (combining intermediate reduced steps).

The map that I used previously was:
function (doc) {
for (var i in doc['preparations']) {
var ingredient = doc['preparations'][i]['ingredient']['name'];
var value = [doc['_id'], doc['title']];
emit(ingredient, value);
}
}
That map function reads: for each preparation instruction, pull out the ingredient name and the document ID/title. The former is used to key the map-reduce. The latter will be used to create the links on the web page. As I mentioned, this map function includes unpublished recipes. To exclude them, I need to add a simple conditional:
function (doc) {
if (doc['published']) {
for (var i in doc['preparations']) {
var ingredient = doc['preparations'][i]['ingredient']['name'];
var value = [doc['_id'], doc['title']];
emit(ingredient, value);
}
}
}
That produces results like this for the ingredient "apples":
"apples":
["2002-09-18-gingerapple", "Ginger Apple Crisps"],
"apples":
["2003-03-11-pecan", "Pecan Apple Tart"],
"apples":
["2003-12-07-applesauce", "Applesauce"],
"apples":
["2004-11-25-apple", "Apple Pie"]
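To convince myself of what the map emits, here is a Ruby stand-in for the JavaScript function. It is only a simulation for reasoning about the output; map_doc and the sample documents are made up for this sketch.

```ruby
# Ruby stand-in for the JavaScript map function: returns the [key, value]
# pairs that the view would emit for a single document.
def map_doc(doc)
  return [] unless doc['published']           # skip unpublished recipes

  (doc['preparations'] || []).map do |prep|
    [prep['ingredient']['name'], [doc['_id'], doc['title']]]
  end
end

pie = {
  '_id' => '2004-11-25-apple', 'title' => 'Apple Pie', 'published' => true,
  'preparations' => [{'ingredient' => {'name' => 'apples'}},
                     {'ingredient' => {'name' => 'butter'}}]
}
draft = pie.merge('_id' => 'draft', 'published' => false)

emitted = map_doc(pie) + map_doc(draft)
emitted.each { |key, value| puts "#{key.inspect}: #{value.inspect}" }
```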
To reduce that to a list of recipe IDs/titles grouped by the "apples" ingredient, I had been using this function:
function(keys, values, rereduce) {
return values;
}
This works for ingredients that are only in a few recipes (like "apples"):
"apple":[["2004-11-25-apple", "Apple Pie"],
["2003-12-07-applesauce", "Applesauce"],
["2003-03-11-pecan", "Pecan Apple Tart"],
["2002-09-18-gingerapple", "Ginger Apple Crisps"]]
For ingredients that are in lots of recipes, a single pass through the reduce function is not sufficient. Instead, CouchDB reduces the values in several batches and then calls the reduce function again on the intermediate results (arrays of arrays in this case). Since I am doing nothing different when CouchDB calls my reduce function with the rereduce flag set, I end up returning those arrays of arrays:
"butter": 
[[["2002-02-08-buffalo_chicken", "Buffalo Chicken Sandwich"],
["2002-02-13-mushroom_pasta", "Pasta with Mushroom Gruyere Sauce"],
["2002-02-17-sausage_pie", "Chicken Sausage Pot Pie"],
["2002-03-12-asparagus_omelet", "Asparagus Omelet"],
["2002-03-12-cinnamon_toast", "Cinnamon Toast"],
["2002-03-14-veal_cutlets", "Breaded Veal Scallopini"],
["2002-03-20-mushroom_pasta", "Mushroom Chicken Pasta"],
["2002-04-09-pasta_primavera", "Pasta Primavera"],
["2002-04-13-chicken", "Breaded Baked Chicken"],
["2002-04-15-cajun_shrimp", "Cajun Shrimp"],
["2002-04-30-chicken_stew", "Chicken Andouille Stew"],
["2002-05-19-crabcake", "Maryland Crab Cakes"],
["2002-05-20-eggs", "Scrambled Eggs with Spinach and Bacon"]],
[[["2003-04-14-sandwich", "Batter-Dipped Ham and Cheese Sandwich"],
["2003-04-25-chicken", "Mustard Seed Chicken with Ginger Orange Sauce"],
//...
To avoid this undesirable outcome, I need to handle re-reduces when CouchDB calls my reduce function a second time:
function(keys, values, rereduce) {
if (rereduce) {
var ret = [];
for (var i=0; i<values.length; i++) {
ret = ret.concat(values[i]);
}
return ret;
}

else {
return values;
}
}
This produces the results that I desire, a flat array of arrays:
"butter" 
[["2002-02-08-buffalo_chicken", "Buffalo Chicken Sandwich"],
["2002-02-13-mushroom_pasta", "Pasta with Mushroom Gruyere Sauce"],
["2002-02-17-sausage_pie", "Chicken Sausage Pot Pie"],
["2002-03-12-asparagus_omelet", "Asparagus Omelet"],
["2002-03-12-cinnamon_toast", "Cinnamon Toast"],
["2002-03-14-veal_cutlets", "Breaded Veal Scallopini"],
["2002-03-20-mushroom_pasta", "Mushroom Chicken Pasta"],
["2002-04-09-pasta_primavera", "Pasta Primavera"],
["2002-04-13-chicken", "Breaded Baked Chicken"],
["2002-04-15-cajun_shrimp", "Cajun Shrimp"],
["2002-04-30-chicken_stew", "Chicken Andouille Stew"],
["2002-05-19-crabcake", "Maryland Crab Cakes"],
["2002-05-20-eggs", "Scrambled Eggs with Spinach and Bacon"],
["2003-04-14-sandwich", "Batter-Dipped Ham and Cheese Sandwich"],
["2003-04-25-chicken", "Mustard Seed Chicken with Ginger Orange Sauce"],
...
I will stop there for the day. Now that I know the format of the output (and that it will work as I desire), I can drive the implementation of the ingredient index tomorrow.
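The difference between the two reduce functions is easier to see in a small simulation. This is a Ruby sketch of the same logic, not something CouchDB runs; the two-batch split and the sample values are assumptions for illustration.

```ruby
# Ruby stand-in for the JavaScript reduce function
def reduce(keys, values, rereduce)
  if rereduce
    # values is a list of earlier reduce results: flatten one level
    values.inject([]) { |ret, chunk| ret + chunk }
  else
    values
  end
end

# Pretend CouchDB reduced the "butter" values in two separate batches...
batch1 = reduce(nil, [["id-1", "Title 1"], ["id-2", "Title 2"]], false)
batch2 = reduce(nil, [["id-3", "Title 3"]], false)

# ...and then combined the intermediate results
naive = [batch1, batch2]                        # what "return values" alone gives
flat  = reduce(nil, [batch1, batch2], true)     # with rereduce handled

puts naive.inspect
puts flat.inspect
```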

Sunday, September 27, 2009

A Poor Man's CMS on CouchDB


Today, I continue my effort to add a poor man's CMS into my Sinatra / CouchDB application. I need to get the catch-all resource to convert the content in a CouchDB document from textile into HTML. An RSpec example describing this:
  context "with an associated CouchDB document" do
before(:each) do
RestClient.
stub!(:get).
and_return('{"content":"*bar*"}')
end

it "should convert textile in the CouchDB doc to HTML" do
get "/foo-bar"
last_response.
should have_selector("strong", :content => "bar")
end
end
To get that to pass, I parse the JSON from CouchDB and use RedCloth to convert the textile in the example into HTML:
get %r{^/([\-\w]+)$} do |doc_id|
begin
data = RestClient.get "#{@@db}/#{doc_id}"
@doc = JSON.parse(data)
rescue RestClient::ResourceNotFound
pass
end
RedCloth.new(@doc['content']).to_html
end
With that working, I would like to insert that HTML into the normal page layout of the site:
    it "should insert the document into the normal site layout" do
get "/foo-bar"
last_response.
should have_selector("title", :content => "EEE Cooks")
end
Getting that to pass is quite easy as Sinatra's haml method is capable of doing just this:
get %r{^/([\-\w]+)$} do |doc_id|
begin
data = RestClient.get "#{@@db}/#{doc_id}"
@doc = JSON.parse(data)
rescue RestClient::ResourceNotFound
pass
end
haml RedCloth.new(@doc['content']).to_html
end
I contemplate adding a feature that will suppress documents without a published attribute or add wiki-like intra-document linking ability. Eventually, I come to my senses and listen to the voice whispering YAGNI in my ear. I want a poor-man's CMS, nothing more.

So that should do the trick. To verify that it really is working, I check out the /about-us resource currently in the application:



As expected, it is not found. With my poor-man's CouchDB-based CMS in place, I need to add a CouchDB document with an about-us document ID:



With that, I can then reload the /about-us resource and:



With that mini-feature complete, I will move on to implementing an ingredient index page, which should be quite easy to do with CouchDB map-reduce.

Saturday, September 26, 2009

CouchDB is not a CMS


On the legacy site, we have several content pages that have yet to make it into the new Sinatra / CouchDB site. The pages include things like an "About this Site" page, an "About Us" page, and pages describing our favorite restaurants.

I could store such pages in the /public directory, but then I would be duplicating headers and footers on each of those pages. As with any duplication, I would have to remember to update the header and footer on each of these static files any time the header/footer Haml partials are updated.

I would like to store these pages in CouchDB, preferably as Textile in a JSON attribute. If the document exists, then the textile would be converted and presented. If the document does not exist, then a 404 would be returned.

To drive this feature, I will describe the fictional resource /foo-bar in an RSpec example:
describe "GET /foo-bar" do
context "without an associated CouchDB document" do
it "should not be found" do
get "/foo-bar"
last_response.status.should == 404
end
end
end
That example passes because Sinatra does not know that resource.

To describe the case when there is a backing CouchDB document, I use this example:
  context "with an associated CouchDB document" do
it "should be found" do
RestClient.stub!(:get).and_return("{}")
get "/foo-bar"
last_response.should be_ok
end
end
end
That example fails because I have no resource to handle it. To get it passing, I need a Sinatra action that responds to something resembling CouchDB resource IDs. Something like this ought to be a good start:
get %r{^/([\-\w]+)$} do |doc_id|
"foo"
end
The textile-couchdb-doc-should-be-found example now passes, but the previous, with-no-corresponding-couchdb-doc example now fails:
1)
'GET /foo-bar without an associated CouchDB document should not be found' FAILED
expected: 404,
got: 200 (using ==)
./spec/eee_spec.rb:605:
The simplest thing that can possibly appease both of these cases is to make a RestClient.get request to the CouchDB resource and, if that fails, to pass handling back to another resource (possibly to be not found anywhere). In other words:
get %r{^/([\-\w]+)$} do |doc_id|
begin
RestClient.get "#{@@db}/#{doc_id}"
rescue RestClient::ResourceNotFound
pass
end
"foo"
end
The pass method comes from Sinatra. It indicates that the currently executing resource does not handle the current request. It passes handling back to Sinatra (in case another resource in the Sinatra application matches) or back to Rack (in case another Rack application is capable of handling this request).
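The overall flow can be modeled in plain Ruby. This is only a sketch: DOCS is an in-memory stand-in for CouchDB, handle is a made-up function, and returning 404 approximates what happens when pass finds no other resource to take the request.

```ruby
# Same pattern as the catch-all Sinatra route
ROUTE = %r{^/([\-\w]+)$}

# Hypothetical in-memory stand-in for CouchDB
DOCS = { 'foo-bar' => '{"content":"*bar*"}' }

def handle(path)
  match = ROUTE.match(path)
  return 404 unless match     # the catch-all route does not apply at all

  doc = DOCS[match[1]]
  return 404 if doc.nil?      # like `pass` when nothing else handles the request

  200
end

puts handle("/foo-bar")                       # backed by a document
puts handle("/about-us")                      # matches the route, but no document
puts handle("/recipes/2001-09-02-potatoes")   # extra slash: route does not match
```

Because the pattern allows only word characters and dashes, the catch-all should not shadow the existing /recipes/... and /meals/... resources.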

With those two examples passing, I call it a day. I will pick back up tomorrow getting the textile in the document converted to HTML and inserted into the normal Haml layout.

Friday, September 25, 2009

Rack Caching: No Longer Just a Proof of Concept


Tonight I would like to implement and deploy the Rack::Cache support that I have been prototyping the past few days. Since I have been prototyping, my first step is to delete all of the uncommitted work.

Next, I need to write a spec that describes the behavior that I desire. Consider a Sinatra action that renders cookbook meals similar to:
get %r{/meals/(\d+)/(\d+)/(\d+)} do |year, month, day|
data = RestClient.get "#{@@db}/#{year}-#{month}-#{day}"
@meal = JSON.parse(data)

url = "#{@@db}/_design/meals/_view/by_date_short"
data = RestClient.get url
@meals_by_date = JSON.parse(data)['rows']

@recipes = @meal['menu'].map { |m| wiki_recipe(m) }.compact

@url = request.url

haml :meal
end
If the client (or Rack::Cache) supplies an If-None-Match value that matches the current CouchDB document requested on the first line in that action, then I would like to halt, by-passing any remaining processing. In such a case, RestClient.get would be invoked only once (on the first line). Expressing this in RSpec format:
  context "cached documents" do
describe 'GET /meals/2009/09/25' do
it "should etag with the CouchDB document's revision" do
RestClient.
should_receive(:get).
once.
and_return('{"_rev":1234}')

get "/meals/2009/09/25", { }, { 'HTTP_IF_NONE_MATCH' => '"1234"' }
end
end
end
When I execute that example, I get a failure because the meal document (the JSON described by and_return) is insufficient to describe a meal as needed in the Haml template:
1)
NoMethodError in 'eee cached documents GET /meals/2009/09/25 should etag with the CouchDB document's revision'
undefined method `map' for nil:NilClass
./eee.rb:85:in `GET (?-mix:\/meals\/(\d+)\/(\d+)\/(\d+))'
I get the example to pass by adding a Sinatra etag call to the action:
get %r{/meals/(\d+)/(\d+)/(\d+)} do |year, month, day|
data = RestClient.get "#{@@db}/#{year}-#{month}-#{day}"
@meal = JSON.parse(data)
etag(@meal['_rev'])

url = "#{@@db}/_design/meals/_view/by_date_short"
data = RestClient.get url
@meals_by_date = JSON.parse(data)['rows']

@recipes = @meal['menu'].map { |m| wiki_recipe(m) }.compact

@url = request.url

haml :meal
end
Since that passes, I know that RestClient is only called once. So the upstream (e.g. Rack::Cache) cached copy will be used instead of regenerating everything again.
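To illustrate why only one request reaches CouchDB, here is a simplified model of the action in plain Ruby. FakeCouch and meal_action are made up for this sketch, and the early return approximates how I understand etag to behave: it compares the quoted value against If-None-Match and halts with a 304 on a match.

```ruby
require 'json'

# Made-up stand-in for CouchDB that counts the requests it receives
class FakeCouch
  attr_reader :requests

  def initialize
    @requests = 0
  end

  def get(path)
    @requests += 1
    path =~ /_view/ ? '{"rows":[]}' : '{"_rev":"1234","menu":[]}'
  end
end

# Simplified model of the meal action: returns only the response status
def meal_action(couch, if_none_match)
  meal = JSON.parse(couch.get("/eee/2009-09-25"))

  # Approximation of etag: halt with 304 when the quoted revision matches
  return 304 if if_none_match == %Q{"#{meal['_rev']}"}

  couch.get("/eee/_design/meals/_view/by_date_short")
  200
end

cached = FakeCouch.new
puts meal_action(cached, '"1234"')
puts cached.requests

fresh = FakeCouch.new
puts meal_action(fresh, nil)
puts fresh.requests
```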

After following the same procedure with the cookbook's recipes action, I am done implementing (implementation was much less than the learning that I needed). So I am ready to deploy.

My Rack configuration file has Rack::Cache configured to store its cache in /var/cache/rack. After ensuring that the Rack::Cache gem is installed on my beta site and that the /var/cache/rack directory exists and is writable by the Thin server's owner, I am ready to deploy:
jaynestown% rake vlad:stop_app
jaynestown% rake vlad:update vlad:migrate vlad:start_app
To verify that caching is working I tail my CouchDB logs on the beta site as I access a mini-chocolate cupcakes recipe. The first time I access the recipe, I see the request for the recipe document itself, the recipe photo (which is stored in CouchDB), and several views:
[Sat, 26 Sep 2009 03:30:08 GMT] [info] [<0.14733.137>] 127.0.0.1 - - 'GET' /eee/2008-07-10-choccupcake 200
[Sat, 26 Sep 2009 03:30:08 GMT] [info] [<0.14800.137>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/by_date_short 200
[Sat, 26 Sep 2009 03:30:09 GMT] [info] [<0.14801.137>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/updated_by?key=%222008-07-10-choccupcake%22 200
[Sat, 26 Sep 2009 03:30:09 GMT] [info] [<0.14802.137>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/update_of?key=%222008-07-10-choccupcake%22 200
[Sat, 26 Sep 2009 03:30:09 GMT] [info] [<0.14803.137>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/alternatives?key=%222008-07-10-choccupcake%22 200
[Sat, 26 Sep 2009 03:30:09 GMT] [info] [<0.14804.137>] 127.0.0.1 - - 'GET' /eee/2008-07-10-choccupcake/cupcake_0043.jpg 200
As expected, subsequent requests only retrieve the recipe document to check the CouchDB revision against the cache:
[Sat, 26 Sep 2009 03:30:17 GMT] [info] [<0.14805.137>] 127.0.0.1 - - 'GET' /eee/2008-07-10-choccupcake 200
[Sat, 26 Sep 2009 03:30:17 GMT] [info] [<0.14815.137>] 127.0.0.1 - - 'GET' /eee/2008-07-10-choccupcake/cupcake_0043.jpg 200
[Sat, 26 Sep 2009 03:30:19 GMT] [info] [<0.14816.137>] 127.0.0.1 - - 'GET' /eee/2008-07-10-choccupcake 200
[Sat, 26 Sep 2009 03:30:19 GMT] [info] [<0.14821.137>] 127.0.0.1 - - 'GET' /eee/2008-07-10-choccupcake/cupcake_0043.jpg 200
The recipe and meal photos are also good candidates for caching via Rack::Cache, so I will likely pick up with that tomorrow.

Thursday, September 24, 2009

Full Stack ETag Support


This is how I am using Rack::Cache, Sinatra, and CouchDB:
          1 |                   ^ 5
            |                   |
        +---v-------------------+---+
        |        Rack::Cache        +-----> file system cache
        +---+-------------------^---+
            |                   |
        +---v-------------------+---+
        |          Sinatra          |
        +---+---------^---------+---+
            |         |         |
          2 |       3 |       4 |
            |         |         |
        +---v---------+---------v---+
        |          CouchDB          |
        +---------------------------+

1. Web client request
2. RestClient request
3. Response (_rev: 1234)
4. Ancillary requests
5. Respond to client and store in file system cache
The nice thing about this stack is that it is all web-based, which will allow me to make certain assumptions when optimizing.

Yesterday, I was able to by-pass step #4 in that diagram which should cut down significantly on the total request time. I used the _rev (revision) attribute returned from CouchDB in step #3 as the argument to Sinatra's etag method. Rack::Cache, in turn, uses that value to decide whether it can use a previously stored cached copy of the HTML generated by Sinatra from the assembled bits of several CouchDB requests.

The action in question:
get '/recipes/:permalink' do
data = RestClient.get "#{@@db}/#{params[:permalink]}"
@recipe = JSON.parse(data)
etag @recipe['_rev']

url = "#{@@db}/_design/recipes/_view/by_date_short"
data = RestClient.get url
@recipes_by_date = JSON.parse(data)['rows']

@url = request.url

haml :recipe
end
If the _rev of the CouchDB recipe document matches the ETag of the HTML document stored in the cache, all other processing stops and the cached copy is immediately returned. If they do not match, a new document is generated.

As of yesterday, with the code above, I have that working. What I would like to accomplish today is skipping step #2. In the case that the web browser already has a cached copy of the web page (and hence knows the HTML document's ETag), why bother requesting the entire document from CouchDB? As long as the ETag and CouchDB _rev match, the request life cycle should stay very close to the top of that diagram.

In order to make that happen, I need the RestClient call at the start of the Sinatra action to supply the If-None-Match HTTP request header attribute that corresponds to the ETag response header attribute.

RestClient supports request attributes via an optional second argument to the get method. To tell CouchDB to only return a recipe if it has been updated, I can use this form:
>> RestClient.get 'http://localhost:5984/eee/2001-09-02-potatoes', :if_none_match =>  "2-2471836896"
=> "{"_id":"2001-09-02-potatoes",
"_rev":"2-2471836896",
"prep_time":10,
"title":"Roasted Potatoes"
...
}
Hmmm... Well, at least I think I should be able to use that form. I am not sure what is going wrong there, but the entire document is being returned. After other troubleshooting fails, I drop down to packet sniffing with tcpdump:
jaynestown% sudo tcpdump -i lo port 5984 -A -s3000
05:48:49.278717 IP localhost.53793 > localhost.5984: P 1:150(149) ack 1 win 513
E...L5@.@............!.`.y.
...............
.[.:.[.:GET /eee/2001-09-02-potatoes HTTP/1.1
If-None-Match: 2-2471836896
Accept: application/xml
Accept-Encoding: gzip, deflate
Host: localhost:5984


05:48:49.278736 IP localhost.5984 > localhost.53793: . ack 150 win 190
E..4;.@.@.. .........`.!.....y.............
.[.:.[.:
05:48:49.282277 IP localhost.5984 > localhost.53793: P 1:221(220) ack 150 win 192
E...;.@.@..,.........`.!.....y.............
.[.;.[.:HTTP/1.1 200 OK
Server: CouchDB/0.9.0a756286 (Erlang OTP/R12B)
Etag: "2-2471836896"
Date: Thu, 24 Sep 2009 09:48:49 GMT
Content-Type: text/plain;charset=utf-8
Content-Length: 1908
Cache-Control: must-revalidate


05:48:49.282309 IP localhost.53793 > localhost.5984: . ack 221 win 530
E..4L6@.@............!.`.y....._.....b.....
.[.;.[.;
05:48:49.282352 IP localhost.5984 > localhost.53793: P 221:2129(1908) ack 150 win 192
E...;.@.@............`.!..._.y.............
.[.;.[.;{"_id":"2001-09-02-potatoes","_rev":"2-2471836896","prep_time":10,"title":"Roasted Potatoes",...
Now c'mon! The request header attribute is being set correctly. It is the same as the ETag and the CouchDB _rev. What am I missing?!

After much head banging, I realize that it is the quotes that I am missing:
>> RestClient.get 'http://localhost:5984/eee/2001-09-02-potatoes', :if_none_match =>  '"2-2471836896"'
RestClient::NotModified: RestClient::NotModified
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient/request.rb:189:in `process_result'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient/request.rb:125:in `transmit'
from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient/request.rb:123:in `transmit'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient/request.rb:49:in `execute_inner'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient/request.rb:39:in `execute'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient/request.rb:17:in `execute'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.0.3/bin/../lib/restclient.rb:65:in `get'
from (irb):13
from :0
Interesting. I am not sure that this is an exceptional case, but I can certainly catch that exception and signal Rack::Cache to immediately send back its copy.
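The only difference between the failing and the working call is the pair of double quotes, which are part of the ETag value itself. A tiny sketch of that comparison (quoted_etag is a hypothetical helper, not something RestClient or Sinatra provides):

```ruby
# Hypothetical helper: wrap a CouchDB revision in the double quotes that an
# ETag / If-None-Match header value carries.
def quoted_etag(rev)
  %Q{"#{rev}"}
end

rev         = "2-2471836896"       # the _rev attribute from the document
server_etag = '"2-2471836896"'     # the Etag header as CouchDB sent it

puts rev == server_etag                 # bare revision: no match, full document returned
puts quoted_etag(rev) == server_etag    # quoted revision: match, 304 Not Modified
```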

Just to be sure that I know what is happening I do check the output of tcpdump in this case. Indeed, the quotes are doing the trick:
jaynestown% sudo tcpdump -i lo port 5984 -A -s3000
05:49:10.661038 IP localhost.53802 > localhost.5984: P 1:152(151) ack 1 win 513
E.....@.@..m.........*.`.e....6............
.\...\..GET /eee/2001-09-02-potatoes HTTP/1.1
If-None-Match: "2-2471836896"
Accept: application/xml
Accept-Encoding: gzip, deflate
Host: localhost:5984

05:49:10.663062 IP localhost.5984 > localhost.53802: P 1:156(155) ack 152 win 192
E...5.@.@............`.*..6..e.?...........
.\...\..HTTP/1.1 304 Not Modified
Server: CouchDB/0.9.0a756286 (Erlang OTP/R12B)
Etag: "2-2471836896"
Date: Thu, 24 Sep 2009 09:49:10 GMT
Content-Length: 0
Now that I understand how to make proper RestClient.get calls with an If-None-Match header attribute, I can wrap it in a begin/rescue block in my Sinatra action:
get '/recipes/:permalink' do
  data =
    begin
      RestClient.get "#{@@db}/#{params[:permalink]}",
                     :if_none_match => request.env["HTTP_IF_NONE_MATCH"]
    rescue RestClient::NotModified
      etag request.env["HTTP_IF_NONE_MATCH"].gsub(/"/, '')
    end

  @recipe = JSON.parse(data)
  etag @recipe['_rev']

  url = "#{@@db}/_design/recipes/_view/by_date_short"
  data = RestClient.get url
  @recipes_by_date = JSON.parse(data)['rows']

  @url = request.url

  haml :recipe
end
That behaves as I expected, but what does this all mean? To answer that, I break out Apache Bench to measure response times:
# Access the rack app with Rack::Cache and with full stack etag support:
ab -H "If-None-Match: '2-2471836896'" -n 100 http://localhost:9292/recipes/2001-09-02-potatoes
# Access the rack app with Rack::Cache, but without full stack etag support:
ab -n 100 http://localhost:9292/recipes/2001-09-02-potatoes
# Access the Thin server (no Rack::Cache, no etag support):
ab -n 100 http://localhost:4567/recipes/2001-09-02-potatoes
The results:
Stack               Average Req./sec
Full stack etag     133.59
Rack::Cache         106.01
No etag/no cache    10.71
The conclusion that I draw is that I definitely want to use Rack::Cache: a roughly tenfold improvement over reassembling the HTML on each request is too good to pass up. As for the roughly 25% speed boost that full stack ETag buys me, I am not sure that it warrants the complexity that it introduces. If nothing else, it is worth considering in certain cases.
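For the record, the relative speedups can be worked out directly from the requests-per-second figures above. A quick sketch of the arithmetic:

```ruby
results = {
  "Full stack etag"  => 133.59,
  "Rack::Cache"      => 106.01,
  "No etag/no cache" => 10.71
}

# How many times faster Rack::Cache is than no caching at all
cache_speedup = results["Rack::Cache"] / results["No etag/no cache"]

# Percentage gained by adding full stack etag on top of Rack::Cache
etag_gain_pct = (results["Full stack etag"] / results["Rack::Cache"] - 1) * 100

puts "Rack::Cache vs. no cache: %.1fx" % cache_speedup
puts "Full stack etag vs. Rack::Cache: +%.0f%%" % etag_gain_pct
```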

Wednesday, September 23, 2009

Rack::Cache


The last time I benchmarked the new site, it was running a little slow. Nothing too serious, but I think I can do better.

Currently, with each request of a recipe or a meal through my Sinatra application:
127.0.0.1 - - [23/Sep/2009 21:24:09] "GET /recipes/2001-09-02-potatoes HTTP/1.1" 200 6936 0.6164
There are multiple CouchDB requests:
[info] [<0.3275.10>] 127.0.0.1 - - 'GET' /eee/2001-09-02-potatoes 200
[info] [<0.5733.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/by_date_short 200
[info] [<0.5734.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/updated_by?key=%222001-09-02-potatoes%22 200
[info] [<0.5737.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/update_of?key=%222001-09-02-potatoes%22 200
[info] [<0.5738.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/alternatives?key=%222001-09-02-potatoes%22 200
[info] [<0.5739.10>] 127.0.0.1 - - 'GET' /eee/2001-09-02-potatoes/roasted_potatoes.jpg 200
[info] [<0.5740.10>] 127.0.0.1 - - 'GET' /eee/2001-09-02-potatoes/roasted_potatoes.jpg 200
There is the initial request for the meal/recipe data itself, plus ancillary requests for next/previous records, records referenced within the data, etc.

I can avoid all of those requests if I cache the output after fulfilling the request the first time. Since I am using Rack, I can implement caching with Rack::Cache.

Rack::Cache uses ETag headers (among other things) to decide when to expire the cache. Since I am using CouchDB, I can use document revisions for the ETag value. In fact, this is what CouchDB itself does:
jaynestown% curl http://localhost:5984/eee/2001-09-02-potatoes -i
HTTP/1.1 200 OK
Server: CouchDB/0.9.0a756286 (Erlang OTP/R12B)
Etag: "1-1030813362"
Date: Thu, 24 Sep 2009 00:30:22 GMT
Content-Type: text/plain;charset=utf-8
Content-Length: 1923
Cache-Control: must-revalidate

{"_id":"2001-09-02-potatoes",
"_rev":"1-1030813362",
"prep_time":10,
"title":"Roasted Potatoes",
"published":true,
...}
To set an ETag, Sinatra provides a handy etag method. By default, it takes a single argument specifying the value to be used for the ETag:
get '/recipes/:permalink' do
data = RestClient.get "#{@@db}/#{params[:permalink]}"
@recipe = JSON.parse(data)
etag @recipe['_rev']

url = "#{@@db}/_design/recipes/_view/by_date_short"
data = RestClient.get url
@recipes_by_date = JSON.parse(data)['rows']

@url = request.url

haml :recipe
end
By placing the etag immediately after the recipe document is retrieved, I prevent execution of the remainder of the code in the action when the cache is still valid.
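The effect of that early exit can be sketched without Sinatra at all. The following is a toy stand-in, not Sinatra's implementation: RecipeAction and its renders counter are hypothetical names, and the counter stands in for the extra CouchDB view requests and the Haml rendering.

```ruby
# Toy model of an action that checks the ETag before doing expensive work.
class RecipeAction
  attr_reader :renders

  def initialize
    @renders = 0
  end

  # Returns a Rack-style [status, headers, body] triple.
  def call(if_none_match, current_rev)
    etag = %("#{current_rev}")

    # If the caller already holds this revision, stop here.
    return [304, { 'ETag' => etag }, ''] if if_none_match == etag

    @renders += 1 # the expensive part only happens past this point
    [200, { 'ETag' => etag }, "<html>recipe at revision #{current_rev}</html>"]
  end
end

action = RecipeAction.new
first  = action.call(nil, '1-1030813362')              # nothing cached yet
second = action.call(first[1]['ETag'], '1-1030813362') # revision unchanged
third  = action.call(first[1]['ETag'], '2-2471836896') # document was updated

statuses = [first[0], second[0], third[0]]
puts statuses.inspect
puts action.renders
```

Only the first and third calls reach the rendering step; the second returns a 304 as soon as the revision matches.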

That gets the desired header attribute set:
jaynestown% curl http://localhost:4567/recipes/2001-09-02-potatoes -i
HTTP/1.1 200 OK
ETag: "1-1030813362"
Content-Type: text/html;charset=UTF-8
Content-Length: 6936
Connection: keep-alive
Server: thin 1.2.2 codename I Find Your Lack of Sauce Disturbing

<html>
<title>EEE Cooks</title>
<link href='/stylesheets/style.css' rel='stylesheet' type='text/css' />
</html>
...
To make use of that ETag attribute for server-side caching, I need Rack::Cache installed:
jaynestown% gem install rack-cache
WARNING: Installing to ~/.gem since /usr/lib/ruby/gems/1.8 and
/usr/bin aren't both writable.
Successfully installed rack-cache-0.5
1 gem installed
To use it, I add the appropriate use call to my rackup file:
require 'eee.rb'
require 'rubygems'
require 'sinatra'
require 'rack/cache'

use Rack::Cache,
:verbose => true,
:metastore => 'file:/tmp/cache/rack/meta',
:entitystore => 'file:/tmp/cache/rack/body'


root_dir = File.dirname(__FILE__)

set :environment, :development
set :root, root_dir
set :app_root, root_dir
set :app_file, File.join(root_dir, 'eee.rb')
disable :run

run Sinatra::Application
Since I am just trying this out in a spike, I am storing the cached files in /tmp. Were this the real thing, I would use a dedicated, non-volatile filesystem like /var.

Also for spike purposes, I start the application directly (I would use rack-aware Thin in a live situation):
jaynestown% rackup config.ru
When I access the document, I see Rack::Cache headers, so it would seem that it is working:
jaynestown% curl http://localhost:9292/recipes/2001-09-02-potatoes -i
HTTP/1.1 200 OK
Connection: close
Date: Thu, 24 Sep 2009 01:36:28 GMT
ETag: "1-1030813362"
X-Rack-Cache: miss, store
X-Content-Digest: f88a43e32dcfcbc15b7b91d760f761670cdd32eb

Content-Type: text/html;charset=UTF-8
Content-Length: 6936
Age: 0

<html>
<title>EEE Cooks</title>
<link href='/stylesheets/style.css' rel='stylesheet' type='text/css' />
</html>
...
If I access that same resource again, Rack::Cache recognizes that the cache is still valid:
jaynestown% curl http://localhost:9292/recipes/2001-09-02-potatoes -i
HTTP/1.1 200 OK
Connection: close
Date: Thu, 24 Sep 2009 02:01:40 GMT
ETag: "1-1030813362"
X-Rack-Cache: stale, valid, store
X-Content-Digest: f88a43e32dcfcbc15b7b91d760f761670cdd32eb
Content-Type: text/html;charset=UTF-8
Content-Length: 6936
Age: 0
Better yet, when the URL is accessed, the main CouchDB document is accessed (to verify that the ETag has not changed), but no ancillary requests are made:
[info] [<0.8491.10>] 127.0.0.1 - - 'GET' /eee/2001-09-02-potatoes 200
To verify that cache expiry is working, I make a small update to the CouchDB document and retry:
jaynestown% curl http://localhost:9292/recipes/2001-09-02-potatoes -i
HTTP/1.1 200 OK
Connection: close
Date: Thu, 24 Sep 2009 02:03:52 GMT
ETag: "2-2471836896"
X-Rack-Cache: stale, invalid, store
X-Content-Digest: 66e55ff8cac5557e7ac73d6ac058a503c7ff5e8f

Content-Type: text/html;charset=UTF-8
Content-Length: 6938
Age: 0
Nice! As expected, the document itself is retrieved from CouchDB, as are the ancillary resources:
[info] [<0.8678.10>] 127.0.0.1 - - 'GET' /eee/2001-09-02-potatoes 200
[info] [<0.8685.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/by_date_short 200
[info] [<0.8686.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/updated_by?key=%222001-09-02-potatoes%22 200
[info] [<0.8689.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/update_of?key=%222001-09-02-potatoes%22 200
[info] [<0.8690.10>] 127.0.0.1 - - 'GET' /eee/_design/recipes/_view/alternatives?key=%222001-09-02-potatoes%22 200
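The three X-Rack-Cache traces seen above (miss, then stale/valid, then stale/invalid) can be reproduced with a toy validating cache. This is only a sketch of the idea under my own assumptions; ToyCache is a hypothetical class and says nothing about how Rack::Cache is actually implemented.

```ruby
# Toy validating cache: always revalidates with the backend via the ETag.
class ToyCache
  def initialize(backend)
    @backend = backend # callable: if_none_match -> [status, etag, body]
    @store   = {}      # url => { :etag => ..., :body => ... }
  end

  # Returns [trace, body] for the requested URL.
  def get(url)
    entry = @store[url]

    if entry.nil?
      _status, etag, body = @backend.call(nil)
      @store[url] = { :etag => etag, :body => body }
      return ['miss, store', body]
    end

    status, etag, body = @backend.call(entry[:etag])
    if status == 304
      ['stale, valid, store', entry[:body]] # reuse the cached body
    else
      @store[url] = { :etag => etag, :body => body }
      ['stale, invalid, store', body]
    end
  end
end

rev = '1-1030813362'

# Stand-in for the Sinatra action: 304 when the revision still matches.
backend = lambda do |if_none_match|
  etag = %("#{rev}")
  if if_none_match == etag
    [304, etag, nil]
  else
    [200, etag, "page for #{rev}"]
  end
end

cache  = ToyCache.new(backend)
url    = '/recipes/2001-09-02-potatoes'
traces = []
traces << cache.get(url).first
traces << cache.get(url).first
rev = '2-2471836896' # simulate a small update to the CouchDB document
traces << cache.get(url).first

puts traces
```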
Tomorrow, I will further explore the REST-like nature of CouchDB. As mentioned earlier, CouchDB has ETag support. I should be able to use that support to do conditional retrieval from inside the Sinatra application. I will implement that and then do some benchmarks to see how much of a performance boost I can achieve.

Tuesday, September 22, 2009

Minor Details, The End?

‹prev | My Chain | next›

More minor little details tonight. First up, links to Amazon.com cookbooks in Haml:
%div
Find cookbooks from
%a{:href => "http://www.amazon.com/exec/obidos/redirect-home/#{AMAZON_ASSOCIATE_ID}"}><
Amazon.com
\:
That produces slightly less than optimal display:



Sigh. Haml is awesome for some things, but messing about with whitespace is not one of them. I want to remove whitespace after the link to Amazon.com (so that the colon is immediately after the link). There should be whitespace between the text "Find cookbooks from" and the link. There also should be no whitespace inside the link.

Haml uses a trailing > to remove whitespace around tags and a trailing < to remove whitespace inside the tag. There is a mnemonic to go with them, but I always have to look up which is which.

At any rate, I end up with this Haml to do the trick:
%div
Find cookbooks from
=" "
%a{:href => "http://www.amazon.com/exec/obidos/redirect-home/#{AMAZON_ASSOCIATE_ID}"}><
Amazon.com
\:
Like I said, Haml does not excel when dealing with whitespace.

To link to the actual cookbook searches, I create a helper with my Amazon referral ID embedded in it. Using that helper:
%div
Find cookbooks from
=" "
%a{:href => "http://www.amazon.com/exec/obidos/redirect-home/#{AMAZON_ASSOCIATE_ID}"}><
Amazon.com
\:
=" "
= @recipe['tag_names'].map{|tag| amazon_cookbook(tag)}.join(", ")
That gets the search links working as desired:




In addition to mucking with whitespace, I also get Google Ads placed at the bottom of the page.

That should just about do it for the minor cleanup. Tomorrow, I may do a little benchmarking and possibly explore implementing some caching on the site.

Monday, September 21, 2009

No Hard Breaks

‹prev | My Chain | next›

Aw, for crying out loud.

I again find myself clicking about the beta site only to find a little problem. Today's problem is weird line breaks:



Grrr... Every time I use RedCloth, I get the line breaks (or something) wrong.

Digging through the code, I have to give the maintainers credit: although the :hard_breaks option is apparently deprecated, there is still an accessor for it. But first, a (RSpec) test:
  it "should not have line breaks in paragraphs" do
textile = <<_TEXTILE
paragraph 1 *bold text*
paragraph 2
_TEXTILE

wiki(textile).
should_not have_selector("br")
end
As expected, that fails:
1)
'wiki should not have line breaks in paragraphs' FAILED
expected following output to omit a <br/>:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>paragraph 1 <strong>bold text</strong><br>
paragraph 2</p></body></html>
./spec/eee_helpers_spec.rb:72:
To get that to pass, I use the hard_breaks accessor in the wiki helper:
    def wiki(original, convert_textile=true)
text = (original || '').dup
#...
if convert_textile
textile = RedCloth.new(text)
textile.hard_breaks = false
textile.to_html
else
text
end
end
That makes the example pass and fixes the display issues:



In addition to resolving my :hard_breaks RedCloth woes, I also spend a little time to get a link to a Google query working:



We are not trying to lock users into our site—it's just a family cookbook, so we like to include those links.

Up tomorrow, I have some Amazon associate work to do.

Sunday, September 20, 2009

Ingredients, with Style

‹prev | My Chain | next›

Work over the past couple of days has the ingredients list in this state:



Which is a definite improvement over this:



Still, there are a few more things that I would like to clean up. It would be nicer to have the "Ingredients" header not quite so large and closer to the list of ingredients:
h2 {
margin-bottom: 0px;
font-size: 1.2em;
}

ul {
margin-top: 0px;
}
I would also like a bit of space around the recipe meta information (the serving info, tools used, etc.):
#recipe-meta {
float: right;
width: 250px;
margin: 3px 8px;
}
I would like to give a little emphasis to the ingredient name to afford quick scanning by the user:
.preparations {
color: #555;
.brand {
font-style: italic;
}
.name {
color: black;
/* background-color: #FFFFe0; */
}
}
(That is Less CSS format, hence the nested rules.)

Lastly, while I am mucking with CSS, I hate having to read very wide text (when I maximize the window). The only part of the recipe that might go wide would be the text instructions below the ingredient list:
#instructions {
width: 50em;
}
With those changes, I have a much cleaner ingredient list:



I also add a footer to the site before finishing coding for the night. With a nice stopping point, I redeploy the app:
jaynestown% rake vlad:stop_app
jaynestown% rake vlad:update vlad:migrate vlad:start_app
I still have a couple of minor little tweaks in need of doing, so I may pick up with that tomorrow.

Saturday, September 19, 2009

Pretty Floats

‹prev | My Chain | next›

Today, I would like to get pretty printing of Float instances working. Specifically I would like:
  • 1.125.to_s == '1⅛'
  • 1.25.to_s == '1¼'
  • 0.875.to_s == '⅞'
First up, I describe existing behavior that should not change:
describe Float, "pretty printing" do
specify "1.23.to_s == '1.23'" do
1.23.to_s.should == "1.23"
end
As expected, that example passes without any changes. Next up, I describe precision. There is no reason to specify more than two decimal places when trying to measure things, so, in RSpec format:
  specify "1.236.to_s == '1.24'" do
1.236.to_s.should == "1.24"
end
That fails with:
1)
'Float to_s should only include 2 decimals precision' FAILED
expected: "1.24",
got: "1.236" (using ==)
./spec/float_spec.rb:8:
To get that to pass, I re-open the Float class to override to_s:
class Float
def to_s
"%.2f" % self
end
end
The updated Float class is not automatically included in my Sinatra application (and thus in the RSpec examples). I save it in the lib directory, and then add this to the Sinatra application:
$: << File.expand_path(File.dirname(__FILE__) + '/lib')

require 'float'
For the remaining numbers that are likely to appear in recipes using imperial units of measure, I use these examples:
  specify "0.25.to_s == '¼'"
specify "0.5.to_s == '½'"
specify "0.75.to_s == '¾'"
specify "0.33.to_s == '⅓'"
specify "0.66.to_s == '⅔'"
specify "0.125.to_s == '⅛'"
specify "0.375.to_s == '⅜'"
specify "0.625.to_s == '⅝'"
specify "0.875.to_s == '⅞'"
I end up using a long case statement to get that passing:
class Float
def to_s
int = self.to_i
frac = pretty_fraction(self - int)

if frac
(int == 0 ? "" : int.to_s) + frac
else
"%.2f" % self
end
end

private

def pretty_fraction(fraction)
case fraction
when 0.25
"¼"
when 0.5
"½"
when 0.75
"¾"
when 0.33
"⅓"
when 0.66
"⅔"
when 0.125
"⅛"
when 0.375
"⅜"
when 0.625
"⅝"
when 0.875
"⅞"
end
end
end
In addition to pretty printing things that look like fractions, I also want to print things that look like integers:
  specify "1.0.to_s == '1'" do
1.0.to_s.should == "1"
end
I get that to pass with another condition on the case statement:
    case fraction
...
when 0
""
end
Last up is a need to account for the cases when the author goes overboard with precision. That is, what happens if I enter 0.67 or 0.667 to mean two-thirds? In RSpec:
  specify "0.33.to_s == '⅓'" do
0.33.to_s.should == "⅓"
end
specify "0.333.to_s == '⅓'" do
0.333.to_s.should == "⅓"
end
specify "0.66.to_s == '⅔'" do
0.66.to_s.should == "⅔"
end
specify "0.667.to_s == '⅔'" do
0.667.to_s.should == "⅔"
end
Rather than account for all possible cases, I use a range in my case statement:
  def pretty_fraction(fraction)
case fraction
...
when 0.33..0.34
"⅓"
when 0.66..0.67
"⅔"
...
end
end
This works because a case statement compares each when clause against the target with the triple equals operator (===), and Range#=== tests for inclusion:
>> (0.66..0.67) === 0.666
=> true
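As an aside, the same behavior can be sketched without re-opening Float and without one when clause per value. The following is an alternative take, not the code used on the site: PrettyQuantity, FRACTIONS and TOLERANCE are hypothetical names, and the tolerance of 0.01 is my assumption for how much rounding slop to accept.

```ruby
# Standalone sketch: pretty-print a quantity using fraction glyphs.
module PrettyQuantity
  # [decimal value, glyph] pairs for common imperial measures.
  FRACTIONS = [
    [1.0 / 8, '⅛'], [1.0 / 4, '¼'], [1.0 / 3, '⅓'],
    [3.0 / 8, '⅜'], [1.0 / 2, '½'], [5.0 / 8, '⅝'],
    [2.0 / 3, '⅔'], [3.0 / 4, '¾'], [7.0 / 8, '⅞']
  ].freeze

  # Accept values like 0.33, 0.333 or 0.67 as "close enough" to a fraction.
  TOLERANCE = 0.01

  def self.pretty(number)
    int  = number.to_i
    frac = number - int

    # Things that look like integers print as integers.
    return int.to_s if frac.abs < TOLERANCE

    _value, glyph = FRACTIONS.find { |value, _| (frac - value).abs < TOLERANCE }

    # No matching glyph: fall back to two decimal places.
    return Kernel.format('%.2f', number) unless glyph

    (int.zero? ? '' : int.to_s) + glyph
  end
end

[1.125, 1.25, 0.875, 0.375, 0.667, 1.0, 1.236].each do |quantity|
  puts "#{quantity} => #{PrettyQuantity.pretty(quantity)}"
end
```

Comparing against a tolerance rather than exact values also sidesteps floating point surprises in the subtraction (1.33 - 1 is not exactly 0.33, for example).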
With my floats printing nicely, I have cleared up the most bothersome of the minor issues that I found on the beta site. Tomorrow, I will work through a few more issues and redeploy.

Friday, September 18, 2009

Minor Details, Part 1

‹prev | My Chain | next›

Clicking about the site, I notice a few minor problems. Mostly the little details that get added to a site after years of development. Most of them can be seen on this recipe screenshot:



On that page:
  • there is no title on the browser tab
  • there is no footer on the page
  • recipes do not include links at the bottom to search Google for similar recipes
  • recipes do not include links at the bottom to search Amazon for cookbooks that might contain similar recipes
I can live with most of those issues. More bothersome to me are problems with the ingredient list:



The list of ingredients:
  • has no heading
  • does not "wiki-fy" links (e.g. [recipe:2002/02/15/cocktail_sauce cocktail sauce] )
  • has decimal numbers instead of pretty fractions (¼, ½, ¾, etc.)
  • can use some additional CSS love—brand name (e.g. Honey Maid) should be de-emphasized, the text should not be quite so close to the "Servings and Times" and "Tools and Appliances" info boxes.
The first item on the list is easy enough to address—I add this to the recipe.haml template:
...
- if @recipe['preparations']
%h2= "Ingredients"
%ul.preparations
- @recipe['preparations'].each do |preparation|
%li.ingredient
...
I am not altering any behavior of the template nor am I affecting the structure of the resulting DOM. It is just an additional label. In other words, no tests needed.

Evaluating the recipe wiki text does describe a change in behavior, so off to my specs! I only want to ensure that the recipe name is wiki-fied, so a simple :should_receive ought to suffice in an RSpec example:
    it "should wiki-fy the link to the other recipe" do
self.should_receive(:wiki).
with("[recipe:2009/09/18/recipe Recipe]").
and_return(%Q|<a href="http://example.org/">Bar</a>|)

render("views/recipe.haml")
end
I make that pass by simply adding a wiki helper call to the ingredient name attribute:
...
- if @recipe['preparations']
%h2= "Ingredients"
%ul.preparations
- @recipe['preparations'].each do |preparation|
%li.ingredient
%span.quantity
= preparation['quantity']
%span.unit
= preparation['unit']
%span.kind
= preparation['ingredient']['kind']
%span.name
= wiki preparation['ingredient']['name']
- if preparation['brand'] =~ /\S/
%span.brand
= "(" + preparation['brand'] + ")"
...
That gets the spec passing. It definitely includes the link on the page now, but inside a <p> tag, which throws off the display:



That extraneous <p> tag is being added by the call to RedCloth in the wiki helper:
    def wiki(original)
text = (original || '').dup
text.gsub!(/\b(\d+)F/, "\\1° F")
text.gsub!(/\[kid:(\w+)\]/m) { |kid| kid_nicknames[$1] }
text.gsub!(/\[recipe:(\S+)\]/m) { |r| recipe_link($1) }
text.gsub!(/\[recipe:(\S+)\s(.+?)\]/m) { |r| recipe_link($1, $2) }
text.gsub!(/\[meal:(\S+)\]/m) { |m| meal_link($1) }
text.gsub!(/\[meal:(\S+)\s(.+?)\]/m) { |m| meal_link($1, $2) }
RedCloth.new(text).to_html
end
There is no need for RedCloth in this case, so it would be nice to be able to tell wiki not to use it:
  it "should skip converting textile to HTML if arg2 is false" do
wiki("textile", false).
should_not have_selector("p")
end
That fails with the following error:
cstrom@jaynestown:~/repos/eee-code$ spec ./spec/eee_helpers_spec.rb 
...........F............................................................................

1)
ArgumentError in 'wiki should skip converting textile to HTML if arg2 is false'
wrong number of arguments (2 for 1)
./spec/eee_helpers_spec.rb:67:in `wiki'
./spec/eee_helpers_spec.rb:67:

Finished in 0.149237 seconds

88 examples, 1 failure
I change the message by adding an optional second argument to the wiki helper method:
    def wiki(original, convert_textile=true)
...
RedCloth.new(text).to_html
end
The spec still fails because I am still converting textile with RedCloth:
cstrom@jaynestown:~/repos/eee-code$ spec ./spec/eee_helpers_spec.rb 
...........F............................................................................

1)
'wiki should skip converting textile to HTML if arg2 is false' FAILED
expected following output to omit a <p/>:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>textile</p></body></html>
./spec/eee_helpers_spec.rb:67:

Finished in 0.154897 seconds

88 examples, 1 failure
To make that pass, I use the second argument in a ternary to by-pass the RedCloth call:
    def wiki(original, convert_textile=true)
...
convert_textile ? RedCloth.new(text).to_html : text
end
That makes it pass, except that it doesn't pass:
cstrom@jaynestown:~/repos/eee-code$ spec ./spec/eee_helpers_spec.rb 
...........F............................................................................

1)
'wiki should skip converting textile to HTML if arg2 is false' FAILED
expected following output to omit a <p/>:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p>textile</p></body></html>
./spec/eee_helpers_spec.rb:67:

Finished in 0.154897 seconds

88 examples, 1 failure
Hunh? I changed that, how could it possibly be failing? It definitely works in the browser now:



So what gives?

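It turns out that the <p> tag is being added by the spec itself: the matcher apparently parses the string as a full HTML document (note the DOCTYPE and <html> wrapper in the failure output), which wraps the bare text in a paragraph. To work around this, I change the expectation from omitting the <p/> tag to not invoking RedCloth. That is more implementation specific, but I am not too concerned. After all, I really am trying to avoid the RedCloth usage here: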
  it "should skip converting textile to HTML if arg2 is false" do
RedCloth.should_not_receive(:new)
wiki("textile", false)
end
With specs passing and display looking good, I call it a day at this point. I will pick up tomorrow with the prettifying of the decimal numbers in the ingredient quantities.

Thursday, September 17, 2009

Deploying the Latest

‹prev | My Chain | next›

Picking up where I left off yesterday, I use these ./script/console methods to dump JSON documents describing recipe updates from the legacy Rails application:
# Traverse the linked list of recipe updates, building up the list
def update_list(recipe, list=[])
if recipe.is_replacement?
update_list(recipe.replacement_for,
list + [{:id => recipe.uri.gsub(/\//, '-')}])
else
list + [{:id => recipe.uri.gsub(/\//, '-')}]
end
end

# For every recipe update, dump a JSON update document that includes
# the entire linked list (from update_list)
RecipeUpdate.find(:all).
reject { |update| update.successor.has_replacement? }.
map {|update| {:type => "Update",
:name => update.successor.title.downcase,
:updates => update_list(update.successor).reverse } }.
each do |update|
filename = "/tmp/#{update[:name].gsub(/\s+/, '_')}_update.json"
file = File.new(filename, "w+")
file.write(update.to_json)
file.close
end

Similarly, this method dumps alternate preparations for recipes:
# For every group of similar recipes, dump a JSON document that
# includes the list.
RecipeGroup.find(:all).each do |group|
alternate = { :type => "Alternative",
:name => group.name,
:recipes => group.recipes.map{|r| r.uri.gsub(/\//, '-')} }
filename = "/tmp/#{group.name.gsub(/\s+/, '_')}_alternative.json"
file = File.new(filename, "w+")
file.write(alternate.to_json)
file.close
end
I move those JSON files into a new data seed directory:
cstrom@jaynestown:~/repos/eee-code$ mkdir couch/seed2
cstrom@jaynestown:~/repos/eee-code$ cd couch/seed2
cstrom@jaynestown:~/repos/eee-code/couch/seed2$ mv /tmp/*json .
Then I use my couch_docs gem to load the JSON into my local CouchDB database:
cstrom@jaynestown:~/repos/eee-code/couch/seed2$ couch-docs load . http://localhost:5984/eee
To make sure that everything is working I check the pizza dough recipe, which has been updated many times:



Yup, the updates are working. How about the alternate preparations?



Yup.

Last up tonight, I indulge in a little #yerdoinitwrong by manually copying the seed data over to the beta site. After loading it with couch-docs, I deploy with vlad:
cstrom@jaynestown:~/repos/eee-code$ rake vlad:stop_app
cstrom@jaynestown:~/repos/eee-code$ rake vlad:update vlad:migrate vlad:start_app
With that, I have my new features deployed: an updated recipe and a recipe with alternate preparations.

Wednesday, September 16, 2009

Alternative Seeds, Seeds for Replacements

‹prev | My Chain | next›

First up tonight, a wonderful feeling:
cstrom@jaynestown:~/repos/eee-code$ cucumber features
...
37 scenarios (1 pending, 36 passed)
334 steps (1 pending, 333 passed)
0m40.713s
cstrom@jaynestown:~/repos/eee-code$ rake
(in /home/cstrom/repos/eee-code)

==
Sinatra app spec
..............................................

Finished in 1.558958 seconds

46 examples, 0 failures

==
Helper specs
.......................................................................................

Finished in 0.120922 seconds

87 examples, 0 failures

==
View specs
......................................................................................................

Finished in 1.203112 seconds

102 examples, 0 failures
36 passing Cucumber scenarios with 333 passing steps in them. 235 passing RSpec examples. That's a happy place.

Before deploying my latest features (hiding recipes that have been updated and alternate preparations for recipes), I need to prepare seed data for the new CouchDB database from the legacy Rails application.

The alternate preparations were called "Recipe Groups" in the legacy Rails application:
class RecipeGroup < ActiveRecord::Base
has_many :recipes
end
When last I made seed data, I populated the CouchDB database via RestClient calls from the Rails ./script/console. To get seed data, I then used the couch_docs gem to dump the data to the file system.

This time, I skip the middleman and dump the JSON directly to the file system from ./script/console:
cstrom@jaynestown:~/repos/eee.old$ ./script/console
>> RecipeGroup.find(:all).each do |group|
alternate = { :type => "Alternative",
:name => group.name,
:recipes => group.recipes.map{|r| r.uri.gsub(/\//, '-')} }
file = File.new("/tmp/#{group.name}.json", "w+")
file.write(alternate.to_json)
file.close
end
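The same dump can be sketched outside of the Rails console. In this version the group data is a hypothetical stand-in for the RecipeGroup records (the names and IDs are made up), the file name is sanitized so that whitespace in a group name cannot end up in the path, and the block form of File.open closes the file even if the write fails:

```ruby
require 'json'
require 'tmpdir'

# Hypothetical stand-in for the RecipeGroup records.
groups = [
  { :name => 'pancake recipes',
    :recipes => %w[2009-01-01-example_one 2009-01-02-example_two] }
]

written = {} # file name => document parsed back from disk

Dir.mktmpdir do |dir|
  groups.each do |group|
    doc = { :type => 'Alternative',
            :name => group[:name],
            :recipes => group[:recipes] }

    # Replace whitespace so the group name is safe to use as a file name.
    filename = "#{group[:name].gsub(/\s+/, '_')}_alternative.json"
    path = File.join(dir, filename)

    # The block form closes the file once the block finishes.
    File.open(path, 'w') { |file| file.write(doc.to_json) }

    # Read it back to confirm that valid JSON was written.
    written[filename] = JSON.parse(File.read(path))
  end
end

puts written.keys.inspect
```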
The recipe updates are a little trickier. In the CouchDB version of the application (and the original XML based version of the application), the list of updates to a particular recipe is stored in a separate update document. In the legacy Rails application, the updates were stored as a linked list:
class RecipeUpdate < ActiveRecord::Base
belongs_to :successor, :class_name => "Recipe", :foreign_key => :successor_id
belongs_to :predecessor, :class_name => "Recipe", :foreign_key => :predecessor_id
end
To obtain the list of updates, first I need the most recent version of the recipe, which I can find by detecting all recipe updates with a "successor" that itself has no successor (update):
RecipeUpdate.find(:all).
reject { |update| update.successor.has_replacement? }
With the most recent version of the recipe in hand, I need to recursively work my way back through the list of previous recipes. I can make use of the Recipe#replacement_for method from the legacy Rails app for that:
>> def update_list(recipe, list=[])
if recipe.is_replacement?
update_list(recipe.replacement_for,
list + [{:id => recipe.uri.gsub(/\//, '-')}])
else
list + [{:id => recipe.uri.gsub(/\//, '-')}]
end
end
Putting the two together, I get the list of updates:
>> RecipeUpdate.find(:all).
reject { |update| update.successor.has_replacement? }.
map {|update| update_list(update.successor)}
...
=> [[{:id=>"2002-12-31-crabs"}, {:id=>"2001-12-31-crabs"}],
[{:id=>"2003-05-26-gravy"}, {:id=>"2002-02-10-biscuits_gravy"}],
[{:id=>"2003-03-30-artichoke"}, {:id=>"2001-10-29-index"}],
[{:id=>"2003-08-17-potatoes"}, {:id=>"2001-09-02-potatoes"}],
[{:id=>"2003-12-24-seafood"}, {:id=>"2001-12-27-seafood_salad"}],
[{:id=>"2003-12-20-soup"}, {:id=>"2002-01-15-gingered_lentil_soup"}],
[{:id=>"2004-05-08-salad"}, {:id=>"2002-01-30-sherry_vinaigrette"}],
[{:id=>"2003-11-25-caesar_salad"}, {:id=>"2001-10-30-caesar_salad"}],
[{:id=>"2003-09-22-soup"}, {:id=>"2002-10-08-soup"}],
[{:id=>"2005-05-29-crabcake"}, {:id=>"2003-04-29-crabcake"}],
[{:id=>"2003-05-26-biscuits"}, {:id=>"2001-09-22-index"}],
[{:id=>"2002-01-29-mashed_potatoes"}, {:id=>"2001-11-06-mashed_potatoes"}],
[{:id=>"2002-05-09-meatballs"}, {:id=>"2002-01-20-meatballs"}],
[{:id=>"2005-04-11-eggplant"}, {:id=>"2002-11-02-eggplant"}],
[{:id=>"2003-10-08-fries"}, {:id=>"2002-05-04-potatoes"}],
[{:id=>"2002-01-19-beef_tenderloin"}, {:id=>"2001-12-10-beef_tenderloin"}],
[{:id=>"2003-02-05-pizza_bread"}, {:id=>"2001-10-30-index"}],
[{:id=>"2003-02-05-balsamic_vinaigrette"}, {:id=>"2002-01-28-balsamic_vinaigrette"}],
[{:id=>"2002-03-28-pizza_dough"}, {:id=>"2002-01-28-pizza_crust"}, {:id=>"2001-12-30-pizza"}],
[{:id=>"2005-06-15-fish"}, {:id=>"2001-11-26-tilapia"}]]
Adding the same to_json here will also dump the JSON seed data.
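To see the traversal in isolation, here is a self-contained sketch that swaps the ActiveRecord models for a plain Struct. The Recipe struct and the slash-separated uri format are assumptions on my part; only update_list mirrors the console method above. The chain used is the three-step pizza dough one from the output.

```ruby
# Minimal stand-in for the legacy Recipe model (an assumption, not the real class).
Recipe = Struct.new(:uri, :replacement_for) do
  def is_replacement?
    !replacement_for.nil?
  end
end

# Walk the linked list from the newest recipe back to the oldest.
def update_list(recipe, list = [])
  entry = { :id => recipe.uri.gsub('/', '-') }
  if recipe.is_replacement?
    update_list(recipe.replacement_for, list + [entry])
  else
    list + [entry]
  end
end

pizza       = Recipe.new('2001/12/30/pizza', nil)
pizza_crust = Recipe.new('2002/01/28/pizza_crust', pizza)
pizza_dough = Recipe.new('2002/03/28/pizza_dough', pizza_crust)

newest_first = update_list(pizza_dough)
oldest_first = newest_first.reverse # oldest-to-newest order

puts newest_first.inspect
puts oldest_first.inspect
```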

I will call it a day at this point and pick up tomorrow with actually loading the data into my CouchDB server—and deploying the data and latest features to the soon to be not-beta site.

Tuesday, September 15, 2009

Last Feature Complete

‹prev | My Chain | next›

Time to verify my very last feature.

I add the alternative_preparations helper to my view:
...
%div
= alternative_preparations(@recipe['_id'])


- if @recipe['preparations']
%ul.preparations
...
Dammit. I don't know why I called it "alternative" preparations. So I rename alternative_preparations to alternate_preparations:
...
%div
= alternate_preparations(@recipe['_id'])

- if @recipe['preparations']
%ul.preparations
...
Ah, better.

When last I was working in the Cucumber layer, my alternate links were missing:
cstrom@jaynestown:~/repos/eee-code$ cucumber   features/recipe_alternate_preparations.feature:14
Sinatra::Test is deprecated; use Rack::Test instead.
Feature: Alternate preparations for recipes

As a user curious about a recipe
I want to see a list of similar recipes
So that I can find a recipe that matches my tastes or ingredients on hand

Scenario: Alternate preparation
Given a "Hearty Pancake" recipe with "wheat germ"
And a "Buttermilk Pancake" recipe with "buttermilk"
And a "Pancake" recipe with "chocolate chips"
When the three pancake recipes are alternate preparations of each other
And I visit the "Hearty Pancake" recipe
Then I should see a link to the "Buttermilk Pancake" recipe
expected following output to contain a <a>Buttermilk Pancake</a> tag:

...
And I should see a link to the "Pancake" recipe
And I should not see a link to the "Hearty Pancake" recipe
When I click the "Buttermilk Pancake" link
Then I should see a link to the "Hearty Pancake" recipe
And I should see a link to the "Pancake" recipe
And I should not see a link to the "Buttermilk Pancake" recipe

Failing Scenarios:
cucumber features/recipe_alternate_preparations.feature:14 # Scenario: Alternate preparation

1 scenario (1 failed)
12 steps (1 failed, 6 skipped, 5 passed)
0m0.675s
All of the steps were defined, including building the alternates document, but there was no code to present the links to alternate preparations. Now that I have done all of my inside work, hopefully that step will pass:
cstrom@jaynestown:~/repos/eee-code$ cucumber -s \
features/recipe_alternate_preparations.feature:14
Sinatra::Test is deprecated; use Rack::Test instead.
Feature: Alternate preparations for recipes

As a user curious about a recipe
I want to see a list of similar recipes
So that I can find a recipe that matches my tastes or ingredients on hand

Scenario: Alternate preparation
Given a "Hearty Pancake" recipe with "wheat germ"
And a "Buttermilk Pancake" recipe with "buttermilk"
And a "Pancake" recipe with "chocolate chips"
When the three pancake recipes are alternate preparations of each other
And I visit the "Hearty Pancake" recipe
Then I should see a link to the "Buttermilk Pancake" recipe
And I should see a link to the "Pancake" recipe
And I should not see a link to the "Hearty Pancake" recipe
When I click the "Buttermilk Pancake" link
Then I should see a link to the "Hearty Pancake" recipe
And I should see a link to the "Pancake" recipe
And I should not see a link to the "Buttermilk Pancake" recipe

1 scenario (1 passed)
12 steps (12 passed)
0m0.772s
Nice!

Before calling this feature done, I have a look at the feature in the browser:



That is pretty good, but I am partial to the way we do it on the legacy site:



The indentation makes it more obvious that the links belong to the header. Making the "Alternative Preparations" text bold catches the eye a bit more. The CSS that does the trick:
#recipe-alternates {
margin:0 8px;
padding-left:10px;
text-indent:-10px;
}

.label {
font-weight: bold;
}
With that, I am done with my last feature!

(commit)

Where do I go from here? Well first up, deployment. Then, maybe some finishing touches here and there. Starting tomorrow.

Monday, September 14, 2009

The Last Helper

‹prev | My Chain | next›

All right, let's see if I can finish this helper off.

I left myself notes about what was left:
cstrom@jaynestown:~/repos/eee-code$ spec -cfs \   
./spec/eee_helpers_spec.rb
...
alternative_preparations
- should retrieve IDs
- should retrieve titles for each alternate ID
- should return nothing if there are no alternate preparations (PENDING: Not Yet Implemented)
- should return a comma separated list of links to recipes (PENDING: Not Yet Implemented)

Pending:

alternative_preparations should return nothing if there are no alternate preparations (Not Yet Implemented)
./spec/eee_helpers_spec.rb:590

alternative_preparations should return a comma separated list of links to recipes (Not Yet Implemented)
./spec/eee_helpers_spec.rb:591

Finished in 0.361926 seconds

85 examples, 0 failures, 2 pending
The "should show nothing" example is a simple one:
  it "should return nothing if there are no alternate preparations" do
alternative_preparations('2009-09-14').
should be_nil
end
In fact, that example passes right away. It is still a good example to have to ensure that I do not break it with future changes, such as:
it "should return a comma separated list of links to recipes" 
Before implementing that example, I need yet another helper—this one to pull back a list of titles, given a list of IDs. It should query the "titles" CouchDB view and it should return a list of hashes. In RSpec format:
describe "couch_recipe_titles" do
  it "should retrieve multiple recipe titles, given IDs" do
    RestClient.
      should_receive(:post).
      with(/titles/, '{"keys":["2008-09-14-recipe","2009-09-14-recipe"]}').
      and_return('{"rows": [] }')
    couch_recipe_titles(%w{2008-09-14-recipe 2009-09-14-recipe})
  end

  it "should return a list of recipe IDs and titles" do
    RestClient.
      stub!(:post).
      and_return <<"_JSON"
{"total_rows":578,"offset":175,"rows":[
{"id":"2008-09-14-recipe","key":"2008-09-14-recipe","value":"Recipe #1"},
{"id":"2009-09-14-recipe","key":"2009-09-14-recipe","value":"Recipe #2"}
]}
_JSON

    couch_recipe_titles(%w{2008-09-14-recipe 2009-09-14-recipe}).
      should == [
        {:id => "2008-09-14-recipe", :title => "Recipe #1"},
        {:id => "2009-09-14-recipe", :title => "Recipe #2"}
      ]
  end
end
The POSTs with a JSON payload in those examples show how CouchDB allows lookup by multiple keys in its views (such as the titles view that is being used here).
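As a rough sketch of what that multi-key request body looks like, here is one way to build it with Ruby's standard JSON library instead of by hand. The keys_payload name is hypothetical and not part of the application; it only illustrates the shape of the payload that the spec above expects:

```ruby
require 'json'

# Hypothetical helper (not in the app): build the body of a
# multi-key view query, i.e. a JSON object with a "keys" array.
def keys_payload(ids)
  JSON.generate({ 'keys' => ids })
end

puts keys_payload(%w{2008-09-14-recipe 2009-09-14-recipe})
# prints {"keys":["2008-09-14-recipe","2009-09-14-recipe"]}
```

The generated string matches the one in the should_receive expectation, so either approach would satisfy that example.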

Yes, those examples are very much tied to the current implementation. I am not writing good regression tests here (that is what Cucumber will do). I am simply trying to drive implementation. Said implementation is nice and clean: pull back the JSON results and map them into a list of hashes:
    def couch_recipe_titles(ids)
      data = RestClient.post "#{_db}/_design/recipes/_view/titles",
        %Q|{"keys":[#{ids.map{|id| "\"#{id}\""}.join(',')}]}|

      JSON.parse(data)['rows'].map do |recipe|
        { :id => recipe["id"], :title => recipe["value"] }
      end
    end
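To see the mapping step on its own, without a CouchDB server, here is a minimal sketch that feeds the canned response from the spec through the same JSON.parse and map logic. The rows_to_titles name and the RESPONSE constant are made up for illustration:

```ruby
require 'json'

# Canned response in the shape returned by the "titles" view
# (copied from the spec), so no network access is needed.
RESPONSE = <<~'RESPONSE_JSON'
  {"total_rows":578,"offset":175,"rows":[
  {"id":"2008-09-14-recipe","key":"2008-09-14-recipe","value":"Recipe #1"},
  {"id":"2009-09-14-recipe","key":"2009-09-14-recipe","value":"Recipe #2"}
  ]}
RESPONSE_JSON

# Hypothetical stand-alone version of the mapping half of the helper:
# each view row becomes a hash with :id and :title keys.
def rows_to_titles(json)
  JSON.parse(json)['rows'].map do |row|
    { :id => row['id'], :title => row['value'] }
  end
end

titles = rows_to_titles(RESPONSE)
puts titles.first[:title]   # prints Recipe #1
```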
With the couch_recipe_titles helper, together with couch_alternatives, I am ready to polish off the alternative_preparations helper. First up, establishing the context of a recipe with two alternative preparations:
  context "recipe with two alternate preparations" do
    before(:each) do
      stub!(:couch_alternatives).
        and_return(%w{2007-09-14-recipe 2008-09-14-recipe})
      stub!(:couch_recipe_titles).
        and_return([
          {:id => "2007-09-14-recipe", :title => "Recipe #1"},
          {:id => "2008-09-14-recipe", :title => "Recipe #2"}
        ])
    end
The two CouchDB helpers return two IDs and two recipes in this context. The first aspect of the helper that I would like to describe here is that there ought to be two links:
    it "should have two links" do
      alternative_preparations('2009-09-14').
        should have_selector("a", :count => 2)
    end
To make that example pass, while still ensuring that the no-alternatives case passes, I add a conditional on the return value of the alternate preparations lookup, and then map the recipes associated with those IDs to <a> tags:
    def alternative_preparations(permalink)
      ids = couch_alternatives(permalink)
      if ids && ids.size > 0
        couch_recipe_titles(ids).
          map{ |recipe| %Q|<a href="/recipes/#{recipe[:id]}">#{recipe[:title]}</a>|}.
          join(", ")
      end
    end
The last thing that I need this helper to do is to include a label:
    it "should label the alternate preparations as such" do
      alternative_preparations('2009-09-14').
        should contain("Alternate Preparations:")
    end
Making that pass is a simple matter of adding the label:
    def alternative_preparations(permalink)
      ids = couch_alternatives(permalink)
      if ids && ids.size > 0
        %Q|<span class="label">Alternate Preparations:</span> | +
          couch_recipe_titles(ids).
            map{ |recipe| %Q|<a href="/recipes/#{recipe[:id]}">#{recipe[:title]}</a>|}.
            join(", ")
      end
    end
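To get a feel for the markup that comes out, here is a small stand-alone sketch of the label-plus-links step. The render_alternates function and RECIPES constant are hypothetical stand-ins: the function takes the list of hashes directly rather than calling the CouchDB helpers, so it can be run without a database:

```ruby
# Stand-in for what couch_recipe_titles returns in the
# two-alternates context from the spec.
RECIPES = [
  { :id => "2007-09-14-recipe", :title => "Recipe #1" },
  { :id => "2008-09-14-recipe", :title => "Recipe #2" }
]

# Hypothetical pure function mirroring the helper's output logic:
# nil when there are no alternates, otherwise a label followed by
# a comma separated list of links.
def render_alternates(recipes)
  return nil if recipes.nil? || recipes.empty?

  links = recipes.map do |recipe|
    %Q|<a href="/recipes/#{recipe[:id]}">#{recipe[:title]}</a>|
  end
  %Q|<span class="label">Alternate Preparations:</span> | + links.join(", ")
end

puts render_alternates(RECIPES)
```

The label span picks up the .label rule from the stylesheet, so the text renders in bold ahead of the links.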
Phew! That combination of helpers ended up taking me much longer than I originally anticipated, but I think that I finally have it done. Not satisfied with "thinking" that I am done, I will verify that I am done by working my way back out to the Cucumber scenario. Tomorrow.