Ruby Driver GridFS API Now Cleaner and Faster
If you have been waiting to try out GridFS with Ruby, your last excuse just went away. A week back, Kyle Banker pushed a new version (0.19.1) of the Ruby driver with a completely new GridFS API.
Ruby File-like API
For those who never used the old API, it was nearly identical to Ruby's File API, which I have always thought was a bit clunky. If you like that style, you can still use it with the new GridFileSystem class:
db = Mongo::Connection.new.db('testing')
grid = Mongo::GridFileSystem.new(db)
grid.open('mr_t.jpg', 'w') do |f|
  f.write File.read('mr_t.jpg')
end
grid.open('mr_t.jpg', 'r') do |f|
  puts f.read
end
That works fine and operates on the notion that you always use a unique path to your file. In the example above, you would probably not use mr_t.jpg and instead do something like /images/mr_t.jpg.
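One way to keep those paths unique is to build them from a namespace plus the file's base name. This is only a minimal sketch: `grid_path` is a hypothetical helper of my own naming, not part of the driver, and it simply produces a string you could pass to grid.open.

```ruby
# Hypothetical helper (not part of the Ruby driver): builds a namespaced
# path such as "/images/mr_t.jpg" from a namespace and a local filename.
def grid_path(namespace, filename)
  # File.basename drops any local directories so only the name is kept.
  File.join('/', namespace, File.basename(filename))
end

# The result could then be used as the filename in the example above:
#   grid.open(grid_path('images', 'mr_t.jpg'), 'w') { |f| ... }
```

Keeping the namespace in one helper means two uploads with the same base name only collide if they also share a namespace.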
New, Simpler API
What I love best about this release, though, is the new API that Kyle has provided. It is as simple as get, put, and delete.
db = Mongo::Connection.new.db('testing')
grid = Mongo::Grid.new(db)
# returns object id
id = grid.put(File.read('mr_t.jpg'), 'mr_t.jpg')
# get the file using object id
file = grid.get(id)
# read the file from grid fs
puts file.read
# delete the file
grid.delete(id)
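Because images are binary, it can be worth reading them in binary mode before handing them to put. Here is a minimal sketch, assuming the put(data, filename), get(id) and delete(id) calls behave as in the example above. `store_file` and `fetch_and_remove` are hypothetical helpers, not driver methods, and `grid` is any object that responds to those three calls the way Mongo::Grid does.

```ruby
# Hypothetical helper: reads a local file in binary mode and stores it.
# Assumes grid.put(data, filename) returns the new file's id, as shown above.
def store_file(grid, path)
  data = File.binread(path)
  grid.put(data, File.basename(path))
end

# Hypothetical helper: reads a stored file's contents, then deletes the file.
def fetch_and_remove(grid, id)
  contents = grid.get(id).read
  grid.delete(id)
  contents
end
```

With a real connection you would call `id = store_file(grid, 'mr_t.jpg')` and hold on to the returned id, since the simple API looks files up by id rather than by name.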
Joint and rack-gridfs
The good news is that you can take advantage of this right away using a MongoMapper plugin named joint and some rack middleware. I forked both of the projects and updated them to work with the new API, opting for the simple get, put and delete methods with ObjectIds.
Joint is as simple as the following:
class Asset
  include MongoMapper::Document
  plugin Joint
  attachment :file
end
Similarly, rack-gridfs requires minimal setup:
gem 'jnunemaker-rack-gridfs', '0.3'
require 'rack/gridfs'
use Rack::GridFS, :hostname => 'localhost', :port => 27017, :database => 'test', :prefix => 'gridfs'
With this middleware in place, you could simply upload an image into the Asset model (if using joint) and then link to it in your templates:
<img src="/gridfs/<%= asset.file.id %>" alt="<%= asset.file.name %>" />
This style of storing files mixed with a touch of caching is really awesome and pretty close to what we are doing in Harmony.
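To give a rough idea of what "a touch of caching" could look like at the HTTP level, here is a sketch of a tiny Rack middleware that marks successful responses under a path prefix as cacheable. This is not what Harmony does and not part of rack-gridfs. The class name `GridFSCacheHeaders` and its `:prefix` and `:max_age` options are assumptions made up for illustration.

```ruby
# Hypothetical Rack middleware (not part of rack-gridfs): adds a
# cache-control header to 200 responses whose path starts with a prefix,
# so a browser or HTTP cache can reuse them.
class GridFSCacheHeaders
  def initialize(app, options = {})
    @app = app
    @prefix = options.fetch(:prefix, '/gridfs')   # assumed URL prefix
    @max_age = options.fetch(:max_age, 3600)      # assumed lifetime in seconds
  end

  def call(env)
    status, headers, body = @app.call(env)
    if status == 200 && env['PATH_INFO'].to_s.start_with?(@prefix)
      # Merge into a copy so the downstream app's header hash is not mutated.
      headers = headers.merge('cache-control' => "public, max-age=#{@max_age}")
    end
    [status, headers, body]
  end
end
```

In a rackup file this would presumably be declared before Rack::GridFS, for example `use GridFSCacheHeaders, :prefix => '/gridfs'`, so that it wraps the responses the GridFS middleware produces. Since each file is addressed by its ObjectId, a long max-age is less risky than it would be for name-based URLs.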
Conclusion
Not only is the new API more aesthetically pleasing, it is also quite a bit faster. Writes are about 4x faster and reads are about 2x faster according to Kyle. Funny how simple code leads to faster code.
5 Comments
Mar 10, 2010
This is seriously cool. Nice work!
Mar 10, 2010
Hey John,
Maybe you’ve gone into this elsewhere, but can you fill us (the uninitiated) in on the advantages of storing files/assets in the DB as opposed to uploading them to S3 and pointing to them in the DB? Specifics on how this is helping you out in Harmony could be helpful as well.
Cheers,
Alex
Mar 10, 2010
I second Alex’s question. What are the advantages of storing images in a database vs S3 or a filesystem? Can you talk a bit about how the caching mechanism for this might work? What speed improvements do you see? I’d love to start storing images along with the rest of my model data…just want to make sure there are real advantages to this approach.
Looking forward to lots more mongo,
Sam
Mar 10, 2010
@Alex Kahn: The Mongo team has covered the advantages of storing files in this manner. I’ll quote below as well:
@Sam: Currently, we just do a simple page cache and expire that cached file when the file changes. We will be moving to Varnish, an HTTP cache, pretty soon, though, so we can scale to multiple app servers more easily.
Mar 19, 2010
I recently made a fork of Paperclip where I merged several forks and refactored a bit. It includes a storage option for GridFS (not my implementation), which should hopefully work with MongoDB GridFS (not tested).
This would make it easy to use MongoDB for file storage in Rails, for example when deploying an app to Heroku, where there is no write access to an "old-fashioned" filesystem.