2011-06-17

Toy URL shortener with Yesod and acid-state

Updated 2011-08-31: Modified urlForm to work with Yesod 0.9.1.

Herein we will create a rudimentary unsafe URL shortener – a toy. We will see how to use David Himmelstrup’s acid-state to persist a data structure and how to use Michael Snoyman’s Yesod to build a simple web application.

This URL shortener is unsafe not in the sense of unsafePerformIO or unsafeCoerce, but in the sense that it is wide open to abuse. If you use it as anything other than a toy, The Internet (spammers et al.) will have you for breakfast. Jason Jacek has written a good article on the good and bad of URL shorteners; if you are looking for a non-toy URL shortener, use one of the products he recommends.

This post is literate Haskell; just copy and paste it into a .lhs file and build/run with GHC.

Preliminaries

Language extensions, imports, and all that. There’s just no way around them!

> {-# LANGUAGE DeriveDataTypeable #-}
> {-# LANGUAGE MultiParamTypeClasses #-}
> {-# LANGUAGE OverloadedStrings #-}
> {-# LANGUAGE QuasiQuotes #-}
> {-# LANGUAGE StandaloneDeriving #-}
> {-# LANGUAGE TemplateHaskell #-}
> {-# LANGUAGE TypeFamilies #-}
> import Control.Exception (bracket)
> import Control.Monad.Reader (asks)
> import Control.Monad.State (gets, put)
> import Data.Acid
> import Data.IntMap (IntMap, Key, empty, insert, lookup)
> import Data.SafeCopy
> import Data.Text (Text)
> import Data.Typeable
> import Prelude hiding (lookup)
> import System.Environment (getArgs)
> import Yesod hiding (Key, insert, Update, update, get)
> type URL = Text
> -- type Key = Int is provided by Data.IntMap.

State

We first create an appropriate (for some definition of appropriate) data structure for storing URLs. Then we will give that data structure ACID (Atomicity, Consistency, Isolation and Durability) guarantees by way of the acid-state package.

URL storage data structure

We’re going to keep things real simple and stuff our URLs into an IntMap using sequentially increasing keys. We also keep track of the latest key assigned.

> data URLStore = URLStore Key (IntMap URL)

Now we create the API we will use for our URL store. We need to be able to create an empty store, add URLs, and retrieve the URL corresponding to a given key.

> -- | Create an empty store.
> emptyStore :: URLStore
> emptyStore = URLStore 0 empty
> -- | Add an URL to the store. Return the key to the URL together
> -- with the updated store.
> addURL :: URL -> URLStore -> (Key, URLStore)
> addURL u (URLStore n m) = let k = n+1 in (k, URLStore k (insert k u m))
> -- | Retrieve an URL from the store.
> retrieveURL :: Key -> URLStore -> Maybe URL
> retrieveURL k (URLStore _ m) = lookup k m

Nothing fancy going on so far, just the plain everyday Haskell you know and love!
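
To see the pure API in action before any persistence enters the picture, here is a small standalone sketch (not part of the literate program, so it is not bird-tracked). It repeats the definitions from above, with two assumptions made only to keep it free of extra packages: URL is a String rather than Text, and IntMap is imported qualified. The demo value is a name invented for this illustration.

```haskell
import Data.IntMap (IntMap)
import qualified Data.IntMap as IntMap

-- Assumption: String instead of Text, so only containers is needed.
type URL = String
type Key = IntMap.Key

data URLStore = URLStore Key (IntMap URL)

emptyStore :: URLStore
emptyStore = URLStore 0 IntMap.empty

addURL :: URL -> URLStore -> (Key, URLStore)
addURL u (URLStore n m) = let k = n + 1 in (k, URLStore k (IntMap.insert k u m))

retrieveURL :: Key -> URLStore -> Maybe URL
retrieveURL k (URLStore _ m) = IntMap.lookup k m

-- Hypothetical demo: thread the store through two insertions by hand,
-- then look up one key that exists and one that does not.
demo :: (Key, Key, Maybe URL, Maybe URL)
demo =
  let (k1, s1) = addURL "http://example.com/a" emptyStore
      (k2, s2) = addURL "http://example.com/b" s1
  in  (k1, k2, retrieveURL k1 s2, retrieveURL 3 s2)
```

Evaluating demo in GHCi gives (1,2,Just "http://example.com/a",Nothing). Note how every call has to pass the latest store along explicitly; that bookkeeping is what the state monad takes over in the next section.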

Make our URL store ACIDic with acid-state

Now we’ll take our vanilla URL store and make it ACIDic using acid-state. First we’ll need Typeable and SafeCopy instances.

> deriving instance Typeable URLStore
> $(deriveSafeCopy 0 'base ''URLStore)

We also have to “port” the add/retrieve API to be ACIDic.

> -- | Add an URL to the ACIDic store. Return the key to the URL.
> add :: URL -> Update URLStore Key
> add u = do
>   (k, store) <- gets (addURL u)
>   put store
>   return k
> -- | Retrieve an URL from the ACIDic store.
> retrieve :: Key -> Query URLStore (Maybe URL)
> retrieve = asks . retrieveURL

Finally, let some magic happen and, ta-da, we have our ACIDic URL store! In particular this creates Add and Retrieve data constructors that will be used below as proxies for add and retrieve.
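
One way to picture those proxies is as plain data describing "which function, with which arguments", plus an interpreter that runs them against a store. The standalone sketch below is a simplified model under that assumption; it is not the code that makeAcidic generates, and Event, runEvent and replay are names invented here. It also uses String for URL to stay dependency-light.

```haskell
{-# LANGUAGE GADTs #-}
import qualified Data.IntMap as IntMap

type URL = String
type Key = Int

data URLStore = URLStore Key (IntMap.IntMap URL)

-- Hypothetical stand-in for the generated event types: one constructor
-- per function, indexed by that function's result type.
data Event r where
  Add      :: URL -> Event Key
  Retrieve :: Key -> Event (Maybe URL)

-- Interpret an event against a store, returning the result and the
-- (possibly updated) store.
runEvent :: Event r -> URLStore -> (r, URLStore)
runEvent (Add u) (URLStore n m) =
  let k = n + 1 in (k, URLStore k (IntMap.insert k u m))
runEvent (Retrieve k) s@(URLStore _ m) = (IntMap.lookup k m, s)

-- Because events are data, a list of them can be replayed to rebuild
-- a store from scratch.
replay :: [URL] -> URLStore
replay = foldl (\s u -> snd (runEvent (Add u) s)) (URLStore 0 IntMap.empty)
```

With this model, fst (runEvent (Retrieve 2) (replay ["http://a.example", "http://b.example"])) evaluates to Just "http://b.example". Treating events as values is what makes it possible to write them to a log and apply them again later.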

> $(makeAcidic ''URLStore ['add, 'retrieve])

The Yesod web app

All that is missing now is the web app that will use our ACIDic state. We’ll use Yesod for that, but only scratch the surface of all the good stuff Yesod can do (including persistence).

Seth Falcon has written a survey of URL shortening service APIs. Our API is very basic: POST an href parameter to / to get a short URL (a GET will also do the job; I’ll let convenience trump idempotency here).

First we define the Yesod foundation type URLShort and make it an instance of Yesod. The foundation will have to carry its state.

> data URLShort = URLShort { state :: AcidState URLStore }
> instance Yesod URLShort where approot _ = ""

Then we define the routes (paths) we will use. #Key will match any leading integer and discard trailing junk.

> mkYesod "URLShort" [parseRoutes|
> /      RootR      GET POST
> /#Key  RedirectR  GET
> |]

URL Shortening

The route handlers for / do the URL shortening. The only difference between a GET and a POST is where the query params come from.

> getRootR  = lookupGetParams  "href" >>= doRootR
> postRootR = lookupPostParams "href" >>= doRootR

When given a single URL, store it and display its key together with an HTML form for submitting another URL.

> doRootR [url] = do
>   acid <- fmap state getYesod
>   key  <- update' acid (Add url)
>   defaultLayout $ do
>     addHamlet [hamlet|<a href=@{RootR}#{key}>@{RootR}#{key}|]
>     addWidget urlForm  -- Is this "The Right Way"?

Otherwise display only the form.

> doRootR _ = defaultLayout urlForm

Here is the HTML form (we don’t bother with fancy formlets).

> urlForm = [whamlet|
> <form action=@{RootR} method=post
>   <input type=text name=href
>   <input type=submit value=Shorten!
> |]

Redirecting

The route handler for /[0–9]* retrieves the appropriate URL and redirects the client. If no URL is found for the key, the client is given a 404.

> getRedirectR key = do
>   acid <- fmap state getYesod
>   url  <- query' acid (Retrieve key)
>   case url of
>     Just u  -> redirectText RedirectPermanent u
>     Nothing -> notFound
>   return ()

The return () at the end is there to help GHC infer a suitable type for getRedirectR.
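
The whole redirect path (pick the key out of the path, look it up, redirect or 404) can also be modelled as a pure function, which is handy for poking at the logic without a web server. The standalone sketch below is such a model under stated assumptions: Response, leadingKey and resolve are hypothetical names, the store is reduced to a bare IntMap of Strings, and the key parsing only mirrors the "leading integer, trailing junk discarded" behaviour described earlier rather than reproducing Yesod's own parser.

```haskell
import Data.Char (isDigit)
import qualified Data.IntMap as IntMap

type URL = String

-- Hypothetical response type standing in for Yesod's redirect / notFound.
data Response = MovedPermanently URL | NotFound
  deriving (Eq, Show)

-- Take the leading decimal digits of a path piece and ignore the rest.
-- A piece with no leading digit yields no key at all.
leadingKey :: String -> Maybe Int
leadingKey piece = case span isDigit piece of
  ("", _)     -> Nothing
  (digits, _) -> Just (read digits)

-- Resolve a path piece against the stored URLs.
resolve :: IntMap.IntMap URL -> String -> Response
resolve urls piece =
  maybe NotFound MovedPermanently (leadingKey piece >>= \k -> IntMap.lookup k urls)
```

With urls = IntMap.fromList [(1, "http://example.com")], both resolve urls "1" and resolve urls "1junk" give MovedPermanently "http://example.com", while resolve urls "7" and resolve urls "abc" give NotFound.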

Applicationification (main)

Our main runs the Yesod application with its AcidState. The port number to run on must be provided. Bad arguments crash (I already mentioned that this is a toy, right?). A bracket ensures that the AcidState is checkpointed and closed on exit.

> -- Provide port number as argument.
> main = do
>   port <- fmap (read . head) getArgs
>   bracket (openAcidState emptyStore)
>           createCheckpointAndClose
>           (warpDebug port . URLShort)

Thanks to the magic of acid-state our URL store is persistent across executions. Go ahead, try it out!
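
If the role of bracket in main is not obvious, the standalone sketch below shows the guarantee it provides using nothing but base. The "resource" here is hypothetical: opening and closing merely append to a log held in an IORef, and runWithLog and demo are names made up for this illustration. The point is only that the release action runs whether the body finishes normally or throws.

```haskell
import Control.Exception (ErrorCall (..), bracket, throwIO, try)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Run a body between a pretend "open" and a pretend "checkpoint and
-- close", recording what happened in order.
runWithLog :: IO () -> IO [String]
runWithLog body = do
  logRef <- newIORef []
  let note msg = modifyIORef logRef (++ [msg])
  result <- try (bracket (note "open")
                         (\_ -> note "checkpoint and close")
                         (\_ -> body))
  case result of
    Left (ErrorCall msg) -> note ("crashed: " ++ msg)
    Right ()             -> note "finished"
  readIORef logRef

-- One run that succeeds and one whose body throws.
demo :: IO ([String], [String])
demo = do
  ok  <- runWithLog (return ())
  bad <- runWithLog (throwIO (ErrorCall "bad port"))
  return (ok, bad)
```

Running demo yields ["open","checkpoint and close","finished"] for the first run and ["open","checkpoint and close","crashed: bad port"] for the second: the release step happens before the exception reaches the caller.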

Exercises

You can move this URL shortener a teenie bit further from toydom and closer to tooldom by implementing one or more of the following:

  • Use a denser representation than decimal digits;
  • Don’t store duplicate URLs (reuse old key);
  • URL preview;
  • JSON, XML, and raw text content;
  • Sanity-check submitted URLs;
  • Filter spam URLs.

From there move on to authentication, statistics, and so on…
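
As a possible starting point for the first exercise, here is a standalone sketch of a base 62 encoding for keys. Everything in it is an assumption of mine rather than part of the post's code: the alphabet (digits, then lower case, then upper case), the names encode and decode, and the choice to reject negative keys with an error. Wiring it into the routes is left as part of the exercise.

```haskell
import Data.List (elemIndex)

-- Assumed alphabet: 0-9, a-z, A-Z (62 symbols).
alphabet :: String
alphabet = ['0' .. '9'] ++ ['a' .. 'z'] ++ ['A' .. 'Z']

-- Encode a non-negative key in base 62, most significant digit first.
encode :: Int -> String
encode 0 = "0"
encode n
  | n < 0     = error "encode: negative key"
  | otherwise = reverse (go n)
  where
    -- Produces digits least significant first; reversed above.
    go 0 = ""
    go k = let (q, r) = k `quotRem` 62 in (alphabet !! r) : go q

-- Decode a base 62 string; Nothing for the empty string or for any
-- character outside the alphabet.
decode :: String -> Maybe Int
decode "" = Nothing
decode s  = foldl step (Just 0) s
  where
    step acc c = do
      a <- acc
      d <- elemIndex c alphabet
      return (a * 62 + d)
```

For example, encode 1000000 is "4c92", four characters instead of seven decimal digits, and decode "4c92" gives back Just 1000000. Unlike the route parser described earlier, this decode is strict: trailing junk makes it return Nothing.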

Thanks for reading!

3 comments:

  1. Nice example. For the getRedirectR case, I would recommend either giving an explicit type signature, or sticking a "return ()" at the end of the do block.

  2. Thanks Michael, I updated the post with your "return ()" suggestion. Yesod's types are for another day... ;)

  3. To build with the recently released Yesod 0.9 change "hamlet" to "whamlet" in the definition of "urlForm".
