March 02, 2011

Posted by John

A Few ObjectId Tricks

Until recently, I did not realize how handy MongoDB’s object ids actually are. Below are a few tips based on some things I have been doing lately.

To and From Strings

First off, it is quite simple to switch back and forth between object id and string, which is useful in JSON/XML serialization and in finding Mongo documents from params.

id = BSON::ObjectId.new
# => BSON::ObjectId('4d6e5acebcd1b3fac9000002') 
id.to_s
# => "4d6e5acebcd1b3fac9000002" 
BSON::ObjectId.from_string(id.to_s)
# => BSON::ObjectId('4d6e5acebcd1b3fac9000002') 
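
When the string comes from request params, it can help to check its shape before converting, since from_string raises an error on anything that is not a valid id. Here is a minimal sketch in plain Ruby, assuming the usual 24-character hex form. Note that `object_id_string?` is a hypothetical helper name for illustration, not part of the driver.

```ruby
# Hypothetical helper (not part of the bson gem): true when +value+ looks
# like the 24-character hex form of an object id.
def object_id_string?(value)
  value.is_a?(String) && value.match?(/\A\h{24}\z/)
end

object_id_string?("4d6e5acebcd1b3fac9000002") # => true
object_id_string?("not-an-id")                # => false
```

The Ruby driver should also offer a similar check of its own (BSON::ObjectId.legal?), so prefer that if your version has it.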

Generation Time

Switching back and forth between strings is simple and obvious. Something a little more interesting is that pretty much every driver supports extracting the generation time from an object id. This means that you can often stop storing created_at in your Mongo documents and instead just pull it from the object id, as long as one-second resolution is enough for you. We are doing this in Gaug.es quite often.

BSON::ObjectId.new.generation_time
# => 2011-03-02 15:01:08 UTC
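
This works because the first four bytes of an object id are a big-endian Unix timestamp in seconds. As a rough sketch of the same extraction in plain Ruby, here it is done on the hex string from the earlier example. `generation_time_from_hex` is a hypothetical name for illustration only; in real code the driver's generation_time is the thing to use.

```ruby
# Hypothetical helper: read the creation time straight out of the hex string.
# The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
def generation_time_from_hex(hex)
  Time.at(hex[0, 8].to_i(16)).utc
end

generation_time_from_hex("4d6e5acebcd1b3fac9000002")
# => 2011-03-02 14:57:18 UTC
```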

The generation time is UTC and you can easily use ActiveSupport’s awesome TimeZone stuff to move the time into different zones.

id = BSON::ObjectId.new
id.generation_time.in_time_zone(Time.zone)

Miss created_at? Just make a simple method that wraps this and returns the generation time in the current zone.

def created_at
  id.generation_time.in_time_zone(Time.zone)
end

Date Range Queries

Where it gets even more interesting is that you can also create object ids from a time. This means you can use object ids in range queries, say to find out how many people signed up for your sweet new app today.

sites = Gauges.db['sites'] # get Mongo::Collection
start_at = Time.zone.now.beginning_of_day
# smallest possible object id for the start of today
min_id = BSON::ObjectId.from_time(start_at)
sites.find('_id' => {'$gte' => min_id}).count

Above I just did a count query, but you could actually iterate the documents or whatever as with any normal Mongo query. The benefit is that _id is automatically indexed, so you do not have to have an extra field plus an extra index.
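
The trick relies on ids sorting by their leading timestamp. As a sketch of the idea in plain Ruby, a boundary id is just the timestamp followed by zeros. I believe this matches what from_time produces, but treat that as an assumption. `boundary_id_hex` is a hypothetical helper name, not a driver method.

```ruby
# Hypothetical helper: smallest possible id (as hex) for a given time,
# i.e. the 4-byte timestamp followed by 8 zero bytes.
def boundary_id_hex(time)
  format('%08x', time.to_i) + '0' * 16
end

start_id = boundary_id_hex(Time.utc(2011, 3, 2)) # => "4d6d88800000000000000000"
end_id   = boundary_id_hex(Time.utc(2011, 3, 3)) # => "4d6eda000000000000000000"

# An id generated on March 2 falls between the two boundaries.
id = "4d6e5acebcd1b3fac9000002"
start_id <= id && id < end_id # => true
```

With the driver, two boundaries built with from_time can be passed as '$gte' and '$lt' to restrict a query to a single day. The timestamp comes from the clock of whichever machine generated the id, so results near a boundary are only as accurate as those clocks.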

Nothing earth shattering here, but I have found these really helpful on Gaug.es and thought I would share.

Labels: Features

11 Comments

  1. John F. Miller

    Mar 02, 2011

    What kinds of security risks are there in putting ObjectIds out over the wire?

  2. @John: Not sure I follow. It is just an id, what is the security risk in exposing that in any way?

  3. I think John Miller means that having access to an object’s creation time is the security risk.

    Developers could be using object ids in URLs or HTML, but may not want people to know the creation time of those records.

  4. @Zef: Ok, personally I have never thought of that as a security risk.

  5. John F. Miller

    Mar 24, 2011

    RE: security – The ObjectID can tell you when it was created, on what machine, by which process, and in what order. I guess after reading about some of the attacks against other protocols that were leaking this kind of information, I tend to double check everything I send to a user. I will admit, I cannot see any useful attack that could be mounted by knowing, for example, when a User registered using their ID.

  6. Ryan Downing

    Apr 28, 2011

    @John: Great post and info. I’m wondering if range queries would be affected, i.e. correct, if you’re dealing with ObjectIds that have been generated across multiple machines. Have you tried this? Is the ObjectId generated from a timestamp machine agnostic?

  7. This is great! Just shaved 33% off one collection’s space by getting rid of updated_at and created_at! We are using MongoDB to log users’ actions so we have a lot of them.

    Thanks!

  8. @Randy: Sweet! Glad it helped.

  9. @Randy: I’m intrigued; how did this help you get rid of the “updated_at” field?

  10. For security concerns, you can encrypt/decrypt the ObjectId with two-way encryption using a private key. I’m not sure if MongoDB supports this kind of feature, but it is worth implementing yourself if you’re concerned about the sensitivity of your dates.

  11. Thanks for the post. My concern about serializing ObjectIDs is performance. I’m using the C driver, and going back and forth between the binary array and the hex string requires quite a few instructions. I’m new to Mongo so I’m exploring my options. I think it makes sense to create a unique id field as a unique string.

About

Authored by John Nunemaker (Noo-neh-maker), a web developer and programmer who has fallen deeply in love with Mongo.
