Sunday, 29 September 2013

Annotations in Scala

So, I've been playing about with some ideas for writing the REST framework I mentioned this morning. I've been trying it in Scala, because - well - I like Scala even if I don't always get on with it. And I've been trying to do it with annotations, in the same way that Spring MVC works, because I quite like that and it's nice and clean and easy to understand. It's similar to the DSL techniques that I've seen elsewhere, but without introducing any magic that makes it difficult to understand how it works - it's just classes and methods that have annotations on them. It also happens to make it easy to extend for other uses - you just add more annotations and it just works. This makes the Swagger integration trivial to add on later.

It's worth putting a quick note in first - the documentation on this is all marked very clearly with an Experimental tag, so it's no surprise that it's not rock solid yet. I'm sure that in a release or two it will be fantastic, but it's just not there yet.

However, Annotations are far from a first class citizen in Scala. I'm not sure they even count as second-class citizens from what I've been seeing. Actually writing an annotation in Scala is easy. You simply write a class that extends from StaticAnnotation and it just works. (Not quite the whole story, but you can work out the rest yourself I'm sure). That's where Easy ends though.

Actually introspecting an object at runtime to get the annotations on it is possible, but it's not trivial. It took me a fair bit of hacking away to get at the following:

This works, and the annotations variable contains a list of all the annotations on the class. However - and I don't understand this yet - it doesn't always work. I wrote this in a unit test, and found that the test passed only about 90% of the time, with nothing at all changing between runs! That's scary, and has totally put me off using Scala annotations for now.

Instead, I've just re-written my annotation in Java, and using it is as trivial as using any Java annotation even when I'm annotating a Scala class. The actual use of a Java annotation is virtually identical to using a Scala annotation - though the Scala annotations do offer more flexibility in that they are full-on classes and so I assume they can contain their own behaviours if you want. I'm not sure where that would ever be useful though to be honest...

REST Frameworks

I keep on coming back to this topic, in a variety of different languages, and I keep on having the same problems. I can never find a framework for writing REST-first services that actually works how I want it to work, and that will scale to a reasonably sized application.

One thing that I find remarkably common is frameworks that want to define every single route in the entire application in the same place, instead of providing some structure for how routes are defined. This works well enough in small applications, where you have a small enough number of routes that you can manage this, but by the time you get to a couple of dozen routes - which really isn't that many when you sit down and think about it - it starts to get unwieldy.

So - in my ideal REST framework, which I might end up actually writing myself at this rate - these are the features that I would look for, in no particular order:

  • Sane route management. I quite like the way that Grape works here, where you define handlers in a class that are all related, and then you mount these classes on specific endpoints. So, to take the traditional blog example, you would have one class that provides all of the handlers related to posts and mount that on /post. You would then have a different class that provides all of the handlers related to users and mount that on /user. This keeps things nice and separated out, whilst still being easy to manage.
  • Support for selecting routes based on more than just the URL path and the Request Method. Selection based on the presence and value of parameters would be necessary to support OAuth 2.0 - as both "Resource Owner Password Credentials Grant" and "Refreshing an Access Token" are defined to be on the /token endpoint, but with different values for the grant_type parameter. 
  • Separate out the output generation from the handler mapping. It's surprisingly common to see frameworks that insist on only one output supported, or on requiring different handlers to produce different outputs. What I'd like is one where you have a handler for a given resource that is written once, and then you can say that you convert the output of this into JSON by doing this, or into XML by doing that. All you then need to do is make sure that the handler produces enough information to support all of the required outputs, and make sure you can write the output mapping routines correctly.
  • Sensible way to handle response headers and status codes. Sometimes you actually want to send back a status code that isn't a 200 - 201 and 204 being obvious candidates here, but all of them make sense. And sometimes you have specific needs to support sending back headers of some kind. These headers are actually more complicated because there are two categories they fall into - general ones or request-specific ones. You might decide that you always want to send back an X-Application-Version header containing the version of the software that's running, but you might only want to specify a Link header that is specific to this one resource. These two use cases probably have two different ways of being supported.
  • Sensible way of supporting error handling. Exceptions make sense here, but this means that there needs to be a way of handling an exception and producing an error document to send to the client. Ideally the way that the output from the exception is handled should be the exact same as how the output from the handler is handled - including status code and header generation as well as the actual content to send back. There are again two different ways that exceptions need to be handled - specific to this request or general across everything.
  • Sensible version management. If you're writing a REST API, then it's a fair assumption that you expect people to use it. If you then make changes to the API, you need to be able to do so in a way that doesn't break existing clients. This means versioning of the API itself. I've seen a number of different ways to achieve this - some better than others in my opinion - but it would be nice to have a way that is pluggable into the framework instead of it enforcing one specific way of doing it. That's probably asking a lot though, so I'd be happy with a sensible strategy in place - something like the Accept-Version header.
  • Sensible support for authentication. There are a number of fairly standard ways to authenticate a REST API these days - OAuth comes to mind here - but not everybody wants to do it that way. The ideal way is to have a pluggable mechanism whereby the request can be processed before the handlers get to it to determine if the request is authenticated, and if so who the user is. This should then automatically fail the request, without it ever reaching the handler, if the authentication isn't valid, or if the request requires authentication and it isn't present.
I also quite like the idea of a framework that has support - if not built in then pluggable - for things like Swagger, which make documenting and testing your API really nice and simple.
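
To make the route selection point from the list above a bit more concrete, here is a rough sketch in Python of choosing between two handlers on the same /token endpoint based on the grant_type parameter. The Route and Router classes and both handler functions are entirely hypothetical, invented for illustration and not taken from any existing framework. The grant_type values are the ones OAuth 2.0 defines for these two flows.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Route:
    """Hypothetical route: matches on method, path and required parameter values."""
    method: str
    path: str
    handler: Callable[[Dict[str, str]], str]
    # Parameters that must be present with exactly these values.
    required_params: Dict[str, str] = field(default_factory=dict)

    def matches(self, method: str, path: str, params: Dict[str, str]) -> bool:
        if method != self.method or path != self.path:
            return False
        return all(params.get(k) == v for k, v in self.required_params.items())


class Router:
    """Hypothetical router: the first registered route that matches wins."""

    def __init__(self) -> None:
        self.routes: List[Route] = []

    def add(self, route: Route) -> None:
        self.routes.append(route)

    def dispatch(self, method: str, path: str, params: Dict[str, str]) -> Optional[str]:
        for route in self.routes:
            if route.matches(method, path, params):
                return route.handler(params)
        return None  # a real framework would produce a 404 or 400 here


def password_grant(params: Dict[str, str]) -> str:
    return "password grant for " + params["username"]


def refresh_grant(params: Dict[str, str]) -> str:
    return "refresh using " + params["refresh_token"]


router = Router()
router.add(Route("POST", "/token", password_grant, {"grant_type": "password"}))
router.add(Route("POST", "/token", refresh_grant, {"grant_type": "refresh_token"}))
```

Both handlers share the same path and method, and only the parameter value decides which one runs. In an annotation-driven framework the required_params part would presumably live on the annotation instead.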

On top of all of this as well, I've noticed a growing trend in the JVM world to write your own server first - this is how things like Play work. I'd much prefer a scenario where you build a WAR file and deploy it to any WAR container as you prefer. 

Tuesday, 14 May 2013

C/C++ Build System - Dependencies

So, I'm currently struggling with how I work out what a dependency actually means for my build system. It's a lot harder to work out these things in a C/C++ environment than in a Java one!

Firstly, there's the easy part. A dependency can imply a number of other dependencies. For example, any time you depend on boost-filesystem you also need boost-system. It seems silly to always have to add these two dependencies yourself, when you could have it worked out automatically for you. I know some people dislike transitive dependencies like this, but I think that as long as they can be managed then they are a good thing because they keep your build cleaner.
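
As a minimal sketch of what "worked out automatically" could mean, the following Python expands a list of dependencies to include everything they imply. The IMPLIED table and the resolve function are assumptions made up for this example. Only the boost-filesystem to boost-system relationship comes from the paragraph above, and "my-app" is an invented name.

```python
from typing import Dict, List

# Hypothetical dependency definitions: each name maps to the
# dependencies it implies. "my-app" is invented for the example.
IMPLIED: Dict[str, List[str]] = {
    "boost-filesystem": ["boost-system"],
    "boost-system": [],
    "my-app": ["boost-filesystem"],
}


def resolve(names: List[str], implied: Dict[str, List[str]]) -> List[str]:
    """Return the given names plus everything they imply, each listed once."""
    resolved: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        if name in seen:
            return  # already handled; this also stops cycles from looping
        seen.add(name)
        resolved.append(name)
        for child in implied.get(name, []):
            visit(child)

    for name in names:
        visit(name)
    return resolved


print(resolve(["boost-filesystem"], IMPLIED))
```

Asking for boost-filesystem alone yields both boost-filesystem and boost-system. Managing the transitive dependencies (excluding or overriding one) would sit on top of something like this.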

Secondly, you have the requirement to set up Linker and Compiler flags. These are where it gets a lot more complicated, and you have to be careful what you actually set up.

Compiler Flags come in a few different forms, some easier to handle than others. These can be categorized as follows:

  • Macro Definitions. Essentially any -D flags (Or as appropriate for the compiler) that need to be passed in for the code to compile correctly
  • Include Directories. These are all of the -I flags (Or as appropriate for the compiler) that need to be passed in for the header includes to be found. These will come in two flavours, depending on whether the dependency is in the same project as this one or not. However, the end result is the same - a compiler flag so that the correct header files can be found
  • Everything else. This is the difficult one, but also the one that is less likely to be needed. These flags are compiler dependent, so they are going to be different for each compiler. Running pkg-config --cflags for every pkg-config script on my system reveals that the only extra flag is -pthread. That is hopeful at least.
Linker Flags come in a similar set of forms as Compiler Flags, again some easier than others to handle. These are as follows:
  • Directories to find Libraries in. These are all of the -L flags (Or as appropriate for the linker) that need to be passed in for the libraries to be linked against to be found. This has the same issue as the Include Directories, with regard to whether a directory is in the same project or not.
  • Libraries to link with. These are all of the -l flags (Or as appropriate for the linker) that need to be passed in for the libraries to be linked with correctly. These can be difficult, because different systems do different things here. On a Linux system, for example, the shared library needs to be somewhere the loader searches at runtime - the standard library directories, anything listed in LD_LIBRARY_PATH, or a path recorded in the binary when it was linked - otherwise the loader can't find it. I don't believe Windows has this problem as long as the DLL is available on the path. There are ways around this, but they are a bit messy.
  • Everything else. Again this is difficult for the same reason as Compiler Flags, but unfortunately it seems to be more widely used. On my system again, pkg-config --libs lists 5 distinct parameters that come under this category, and I'm not sure what all of them do without reading documentation. These are -pthread, -rdynamic, -r:, -Wl,--export-dynamic and -Wl,-R. The -Wl ones look problematic: they tell gcc to pass the rest of the flag through to ld, but you can't always guarantee that linking goes through gcc rather than calling ld directly.
Finally, there are the complications of working out how to get these settings. The easy case is to have them hard coded in the dependency file, but that gets brittle when you need to work across different systems. Arguably the best option is to support tools like pkg-config, except that not all libraries support it, so that becomes problematic. I'm not sure yet what the best answer is for this...
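
The categories above could be sketched roughly like this in Python: take a string of flags, such as the output of pkg-config --cflags or pkg-config --libs, and sort each flag into one of the three buckets. The function names and bucket names are my own invention for the example. It also assumes gcc-style flags with the value joined to the switch (-I/path rather than -I /path), so it is an illustration rather than a complete parser.

```python
import shlex
from typing import Dict, List


def classify(flags: str, prefixes: Dict[str, str]) -> Dict[str, List[str]]:
    """Sort flags into buckets by prefix; anything unrecognised goes in "other"."""
    groups: Dict[str, List[str]] = {bucket: [] for bucket in prefixes.values()}
    groups["other"] = []
    for flag in shlex.split(flags):
        for prefix, bucket in prefixes.items():
            if flag.startswith(prefix):
                groups[bucket].append(flag[len(prefix):])  # keep only the value
                break
        else:
            groups["other"].append(flag)  # keep the whole flag untouched
    return groups


def classify_cflags(flags: str) -> Dict[str, List[str]]:
    # -D macro definitions and -I include directories, as described above.
    return classify(flags, {"-D": "defines", "-I": "include_dirs"})


def classify_libs(flags: str) -> Dict[str, List[str]]:
    # -L library directories and -l libraries; the prefix match is case sensitive.
    return classify(flags, {"-L": "library_dirs", "-l": "libraries"})


print(classify_cflags("-DNDEBUG -I/usr/include/foo -pthread"))
print(classify_libs("-L/usr/lib/foo -lfoo -Wl,--export-dynamic -pthread"))
```

The "other" bucket is where the awkward flags like -pthread and the -Wl ones end up, which is exactly the part a build system would need to translate per compiler and linker.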

Saturday, 11 May 2013

C/C++ Build System - Simple Configuration

So, I'm writing up what I think would be a decent start for a configuration file for my new build system. It's very inspired by Maven, so the configuration is very similar. I make no excuses for that.

This is an example configuration file that describes a simple project that builds an executable application, and depends on

  • Boost Filesystem 1.49+
  • Boost ASIO 1.49+
  • NCurses 5+

Ideally it will be able to tell that Boost 1.53 - which is the current version - is greater than 1.49 and so will be acceptable, and ideally it will be able to be configured to know that a certain version is too high and isn't suitable.
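
A minimal sketch of that version check, in Python, might look like the following. The function names are hypothetical, and treating the upper limit as exclusive is my assumption, since I haven't decided how "too high" should actually be expressed.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "1.53" into (1, 53) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def is_acceptable(installed: str, minimum: str, maximum: Optional[str] = None) -> bool:
    """True if installed >= minimum and, when a maximum is given, installed < maximum."""
    version = parse_version(installed)
    if version < parse_version(minimum):
        return False
    # Assumption: the upper limit is exclusive.
    if maximum is not None and version >= parse_version(maximum):
        return False
    return True


print(is_acceptable("1.53", "1.49"))
print(is_acceptable("1.48", "1.49"))
```

Comparing tuples of integers rather than strings or floats matters once you get versions like 1.10, which is newer than 1.9 but would sort before it as text. The sketch only copes with purely numeric, dot-separated versions, and it treats 1.49 and 1.49.0 as different, so a real implementation would need to normalise those.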

Because C/C++ projects work differently from Java ones, the dependency definitions will be available but the dependencies themselves will need to be installed on the system already.

I'm also thinking - because it works well for Maven, and makes sense to me - to have dependency definitions be accessed over the internet from a central repository. That way, you don't need to ensure that you have them on your system in the correct directory for them to work...

C/C++ Build System

I've talked before - here and elsewhere - about the state of C and C++ build systems. And it's not good. Basically, right now, the ecosystem comes down to a whole bunch of disparate tools that are all difficult to use, and that pretty much all involve using a code-driven build script in one way or another (be it m4 - e.g. Autotools, Python - e.g. Waf, or some custom language - e.g. CMake).

So, I'm toying with the idea of writing my own. Probably a stupid idea, and probably one that will never go anywhere, but we'll see. I'm currently playing with the idea of a Maven style build system, where you write configuration file(s) to describe the build and the build system understands that and does the work. This will likely work, similar to Maven, in a plugin-oriented way to make it easy to extend the build system if necessary, but so that you always write your actual build scripts in a declarative configuration file instead of in code...

Watch this space :)

Saturday, 2 March 2013

JSR 303 Bean Validation with Spring

I have to work out how to do this far too often, so I'm writing it down. It's actually really easy, but there's always something that catches me out. So here it is.

Firstly, the dependencies. I'm using Spring 3.2, so the dependencies you need are:

  • org.springframework:spring-context
  • org.hibernate:hibernate-validator:4.3.1.Final - runtime scope is all you need
  • javax.validation:validation-api:1.0.0.GA
As long as all of those are available, the only thing you need to do is define a bean in your Spring context of type org.springframework.validation.beanvalidation.BeanValidationPostProcessor. This works whether you are using Spring WebMVC or standalone, which has the important side effect that it works in unit tests.

If you do this, then you can start to add all the fancy @NotNull and similar annotations to your beans, and Spring will ensure they are actually valid when the context is loaded.

Tuesday, 1 January 2013

CDParanoia Output

So, I've been playing around of late with something to do with ripping CDs. Specifically a tool to make it very easy to rip Audiobook CDs to put onto an MP3 player, but that's not hugely important.

What I've been trying to do is to use the "cdparanoia" tool to actually rip the CD, and then once ripped it can be converted to the appropriate final format using something like lame. Now, cdparanoia has some useful features for wrapping in a script, including a more useful output format that can be parsed. Unfortunately, as best as I can tell, this output format is never documented anywhere! As such, I've finally given up and gone to the source to work out what it means. And here it is...

The output consists of many many lines similar to:

##: 0 [read] @ 175224
##: 1 [verify] @ 0
##: -2 [wrote] @ 1175

Which isn't especially self explanatory... The actual format string is:

##: %d [%s] @ %ld

Where the values substituted in are:
%d -> A number indicating the function being performed
%s -> The name of the function being performed
%ld -> The position being worked on.

The bits of interest here are the function being performed, and the position being worked on.

The function being performed is a number between -2 and 13, as follows:
  • -2 -> wrote
  • -1 -> finished
  • 0 -> read
  • 1 -> verify
  • 2 -> jitter
  • 3 -> correction
  • 4 -> scratch
  • 5 -> scratch repair
  • 6 -> skip
  • 7 -> drift
  • 8 -> backoff
  • 9 -> overlap
  • 10 -> dropped
  • 11 -> duped
  • 12 -> transport error
  • 13 -> cache error
I'm not too sure what all of these actually mean, but the important ones are fairly obvious. These are all taken from the strings table in the code, so are the actual mappings between the function ID and the function name in the output. As such, everywhere you see -2 you will always see [wrote], and so on.
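
For wrapping this in a script, a line in that format can be pulled apart with a regular expression. This is a small Python sketch based purely on the format string above. The Progress type and the parse_line function are names I've made up for the example.

```python
import re
from typing import NamedTuple, Optional

# Mirrors the format "##: %d [%s] @ %ld" described above.
LINE_RE = re.compile(r"^##: (-?\d+) \[(.+)\] @ (-?\d+)$")


class Progress(NamedTuple):
    code: int      # the function number, -2 to 13
    name: str      # the function name, e.g. "read" or "scratch repair"
    position: int  # the position being worked on


def parse_line(line: str) -> Optional[Progress]:
    """Parse one progress line, or return None if it isn't in the expected format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return Progress(int(match.group(1)), match.group(2), int(match.group(3)))


print(parse_line("##: 0 [read] @ 175224"))
print(parse_line("##: -2 [wrote] @ 1175"))
```

Since the name always matches the number, a script only really needs the code and the position, but keeping the name around makes debugging output easier to read.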

The next part of concern is the progress itself. This is represented by the position value, but not in any way that directly relates to any other numbers you can easily get out. In the CD I'm trialling this with, track one goes from sectors 0 -> 20139, but the position values are all over the place and mostly significantly higher than this. And there is no obvious connection between the numbers, which doesn't help at all.

Again, reading through the code, you eventually discover how this works. The current sector is the position value divided by CD_FRAMEWORDS, where CD_FRAMEWORDS is half CD_FRAMESIZE_RAW and CD_FRAMESIZE_RAW is 2352. This means that if you take the position value and divide it by 1176 - which is half of 2352 - then you get the sector number. Furthermore, the progress bar that is output shows the sector number calculated from the last position value that we saw from a Verify (1) or Wrote (-2) function, and ignores all of the other ones. This is thus the sector that was last written to disc and not the last sector read. As proof of this, for the CD I'm playing with, the very last output line for a function of Wrote had a position of 23683463, which works out to be sector 20138.9991496599 - effectively 20139, the end of track one.
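
The arithmetic above, as a small Python sketch. The two constants use the names and values quoted from the cdparanoia source, while position_to_sector and progress_sector are hypothetical helper names of my own. The events passed in are assumed to be (function number, position) pairs taken from the output lines.

```python
from typing import Iterable, Optional, Tuple

CD_FRAMESIZE_RAW = 2352
CD_FRAMEWORDS = CD_FRAMESIZE_RAW // 2  # 1176

# Only Verify (1) and Wrote (-2) move the progress bar, as described above.
PROGRESS_CODES = {1, -2}


def position_to_sector(position: int) -> float:
    """Convert a position value from the output into a sector number."""
    return position / CD_FRAMEWORDS


def progress_sector(events: Iterable[Tuple[int, int]]) -> Optional[float]:
    """Sector for the last Verify or Wrote event seen, or None if there were none."""
    sector = None
    for code, position in events:
        if code in PROGRESS_CODES:
            sector = position_to_sector(position)
    return sector


print(position_to_sector(23683463))
```

Feeding in the final Wrote position of 23683463 gives a value just under 20139, matching the figure above.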

There's then a whole lot of logic surrounding exactly which smiley face to show, how the progress bar line gets written and so on, but most of that is stuff that external programs don't need to care about.