Archive for August, 2010

Pre-Mid-Life Crisis?!

August 30th, 2010 2 comments

I think I’m going through a pre-mid-life crisis. The thought hadn’t crossed my mind at all until a friend of mine brought up the idea on my birthday. I was telling her how for my birthday the wife got me a package at DriveTech Racing School. We haven’t booked a date yet, but when we do go, I’ll be receiving some basic instruction and 12 laps around a 1.5-mile racing track in an actual NASCAR car. With a V8 at my feet and the bleachers flying by at over 100 mph, we’re hoping this will address my need for speed. Very few things make me happier than punching the accelerator, negotiating turns at high speed, gripping the wheel tight, and simply seeing the world zip by.

I then went on to tell my friend about my then-upcoming motorcycle training class. I spent this past weekend under the hot sun on a motorcycle. This was my first time actually riding one myself (I’ve been a passenger on one before). Let me tell you – riding a motorcycle is a lot harder than you’d think. Firstly, having to switch gears is a stark change from the cushy life of automatic transmissions that I’ve gotten used to. Then getting used to completely new controls (clutch at the hand, shift lever at the foot, brakes at the hand and foot, throttle at the hand) takes a while too. I’m used to braking and applying gas with my feet; it’s not quite the same with the hand. Then finally there’s turning and maneuvering – leaning a 300-pound motorcycle into a sharp turn at speed isn’t exactly as comforting as taking the same turn in a car – 2 wheels are very different from 4. All things considered, I had a blast and am glad I did it. At the end of the day I passed both the road test and the written test and so will soon have a motorcycle endorsement on my license (once I get down to the DMV that is). Now all I need is a motorcycle. I’ll have to start shopping around and learning as much as I can about buying one!

Finally there’s the skydiving that I have also signed up for. I’ve always talked about jumping out of an airplane and now I’ll finally have the chance to do so. Again, we haven’t scheduled it, but we’re thinking of going in October. I’ve heard stories and seen videos and it looks absolutely thrilling. I’m very much looking forward to free-falling with my limbs spread out, the world (or at least a few acres of it) beneath me, and the adrenaline pumping. Hopefully it’s not too cold by mid-October. But then again, who cares – I’m going to be jumping out of an airplane!

So, that’s it – the evidence of my pre-mid-life crisis. I’m certainly not “mid-life” yet, but getting a motorcycle license, spending time in a NASCAR car on a racetrack, and jumping out of an airplane sounds a lot like a mid-life crisis to me. Add to that the recent job situation changes, which have me questioning a lot of what I’ve done over the years, and I think I have all the symptoms.

On the bright side, now that my mid-life crisis is out of the way, it’s smooth sailing from here…

… right?

Categories: Life

All done…

August 16th, 2010 5 comments

Well, it’s over… CFA results came out this morning, and I passed. Believe me, I’m glad to be done with the last of the three CFA exams. Furthermore, with pass rates that are less than 50%, I’m proud to say that I passed each one on my first attempt. However, just like last year, when I passed Level 2, my excitement, though present, is somewhat muted. Though I am done with the exams, I still won’t earn the three golden letters, CFA, until later this year. In addition to passing the three exams, the CFA Institute requires that candidates have 4 years of work experience that it approves as “investment decision making”. Being a career changer, that means a lot of my software background won’t count, leaving me a few months shy. So, I won’t earn my charter until the end of the year.

As I look back at the CFA journey, which began for me in 2006 (I took two years off as we moved around the country), my conclusion is that, time-consuming commitment though it is, I highly recommend the CFA program for anyone looking to make a career in investments. The subjects covered by the exams are both broad and deep (arguably too deep in some instances, like swap valuation), giving candidates very good, holistic insight into the field. I especially like the way that things tie together: the first exam covers all the fundamentals of economics and accounting, the second covers valuation and international concepts, and the third ties it all together with portfolio concepts.

My congratulations to all those that passed any level this year, and good luck to those that’ll be taking it in the future! Now, off to celebrate…

Categories: Life

Matlab woes…

August 11th, 2010 No comments

WARNING: The following rant is somewhat technical. It’s a departure from the economics/finance of quantitative equity into the actual technical implementation and the challenges that are often faced. If you don’t know what threading, serialization, and parallel computing are, you’ll probably want to stop reading right here!

I use Matlab to analyze the gigabytes of equities data I have in order to put on trades. Last week I ran into a frustrating brick wall with Matlab that I can’t seem to get around. Hopefully this helps someone else in a similar situation.

Until a couple of weeks ago, it would take 8 hours for my program to simulate 20 years of trading activity and spit out results. Clearly this is annoying since if I want to change a quantitative factor or two and see how things would look, I have to let it run for 8 hours! At this point, I was fully utilizing Matlab’s parallel computing features (which actually helped considerably – without them I would have seen a run time of 20+ hours!).

I had the idea of implementing some kind of caching scheme to speed up the data retrieval from a heavily normalized database. That took a bit of programming to implement, but is now working like a charm. Run time was reduced from 8 hours to about 3 hours. That’s a great improvement, but if we could get faster still, that’d be great.

To understand where the bottleneck was, I killed off one factor at a time and measured how runtime was affected. In the end, I found that killing all factors (so for each month my program essentially connects to the database and disconnects) made an inconsequential difference. That is, the majority of the time is spent in the database connect/disconnect (something I, as a former programmer, should have known quite well!).

So, the solution: reuse the database connections (in the technical world, known as connection pooling). So, I spent many hours investigating Matlab’s database toolbox and its ability to pool connections. To make a long story short, Matlab’s database connectivity (built on Java JDBC drivers) doesn’t support connection pooling. I suppose if I were running it in an application server (Tomcat, something else…), I might be able to make it work. But I have neither the environment, the expertise, nor the desire to go down that path.

So, I decided to implement my own pooling. To do this, I wrote a Java connection pool manager. That is actually a simple task since all a connection pool manager is, is a vector of connections and the logic to reuse a connection if it is idle. Having never done Matlab/Java integration before, it took me a bit to figure out how to make it all work, but in short order I was there. All set, right? No…
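For the curious, here is a minimal sketch of what "a vector of connections and the logic to reuse a connection if it is idle" can look like in Java. It is not the pool manager described above: the class name `SimplePool` and its methods are made up for illustration, and it is written generically so it runs without a database. For real JDBC use, the factory would presumably wrap a call such as `DriverManager.getConnection(...)` and deal with its checked `SQLException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical minimal pool: a list of entries plus the logic to reuse an idle one. */
class SimplePool<T> {
    /** One pooled resource and a flag saying whether it is currently checked out. */
    private static final class Entry<T> {
        final T resource;
        boolean inUse;
        Entry(T resource) { this.resource = resource; }
    }

    private final List<Entry<T>> entries = new ArrayList<>();
    private final Supplier<T> factory; // creates a new resource when none is idle

    SimplePool(Supplier<T> factory) { this.factory = factory; }

    /** Hand out an idle resource if one exists; otherwise create a new one. */
    synchronized T acquire() {
        for (Entry<T> e : entries) {
            if (!e.inUse) {
                e.inUse = true;
                return e.resource;
            }
        }
        Entry<T> fresh = new Entry<>(factory.get());
        fresh.inUse = true;
        entries.add(fresh);
        return fresh.resource;
    }

    /** Mark a resource idle again so a later acquire() can reuse it. */
    synchronized void release(T resource) {
        for (Entry<T> e : entries) {
            if (e.resource == resource) {
                e.inUse = false;
                return;
            }
        }
        throw new IllegalArgumentException("resource does not belong to this pool");
    }

    /** Number of resources created so far (idle or in use). */
    synchronized int size() { return entries.size(); }
}
```

The caller takes a connection with `acquire()`, runs its queries, and hands it back with `release()` instead of closing it, so the expensive connect/disconnect happens once per pooled connection rather than once per use. A production pool would also need to cap its size and check that idle connections are still alive; this sketch does neither.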

I changed my code to use the connection pooling rather than native Matlab database connections. However, it blew up due to the parallelization. What I didn’t realize going in is that when Matlab parallelizes code, it actually serializes objects, splits up the work across multiple workers (separate Matlab processes rather than threads inside one process), and then hydrates the objects in each worker. Unfortunately, the Matlab database object doesn’t support serialization. Thus, there was no way to pass a connection object into the parallel code. If I choose to eliminate the parallel code, then I can use connection pooling, but everything will run sequentially in a single thread, which eliminates the need to even use multiple connections, along with much of the performance boost I’ve seen so far.

I also learnt along the way that a Matlab database object isn’t the same as the underlying JDBC database object. Matlab encapsulates the JDBC equivalent in some custom structures. Once I realized this, I thought that maybe I could get around this issue by using JDBC objects directly rather than their Matlab equivalents. However, the JDBC connection objects also don’t support serialization, bringing me back to a dead end.
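The serialization problem can be reproduced in plain Java, without Matlab or a database. The sketch below uses Java's built-in serialization as an analogy only; I am assuming the mechanism Matlab uses to ship variables to its workers runs into the same basic constraint. `LiveHandle` and `Task` are hypothetical stand-ins, not real JDBC or Matlab classes. The one JDBC-specific fact it checks is that the `java.sql.Connection` interface does not extend `Serializable`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.sql.Connection;

class SerializationCheck {
    /** Hypothetical stand-in for a live handle (socket, DB session); deliberately NOT Serializable. */
    static final class LiveHandle {
        final int id;
        LiveHandle(int id) { this.id = id; }
    }

    /** Hypothetical unit of work to ship to a worker; it carries the handle along as a field. */
    static final class Task implements Serializable {
        private static final long serialVersionUID = 1L;
        final String query;
        final LiveHandle handle;
        Task(String query, LiveHandle handle) {
            this.query = query;
            this.handle = handle;
        }
    }

    /** Returns true if Java serialization can write the object, false if it refuses. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // some object in the graph is not Serializable
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks whether the JDBC Connection interface is declared Serializable (it is not). */
    static boolean connectionIsSerializableType() {
        return Serializable.class.isAssignableFrom(Connection.class);
    }
}
```

A `Task` with a `null` handle serializes fine, but as soon as it holds a `LiveHandle` the write fails with `NotSerializableException`. That mirrors the dead end above: a live connection is tied to a socket and session state in one process, so there is nothing meaningful to copy into another.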

So, I’m left with a frustrating situation:

  1. The only way to combine pooling with parallelization is to have a connection pool pass a connection into the parallel code. Having multiple connection pools (one for each worker) is silly and is no better than where I am now. So, to have a single pool whose connections the parallel code can use, a connection must be passed into the parallel code. That requires serialization, which JDBC connections (and thus Matlab connections) don’t support.
  2. I could skip parallelization altogether and run everything single threaded: this’ll likely take me back to 20+ hours!
  3. Leave things the way they are: the parallel code creates a connection, works with the data, and disconnects. This totals a 3 hour runtime, but seems to be the least of the evils.

Alternatives:

  1. Rebuild an environment that uses some kind of application server that provides the connection pooler. I’m not even sure what this would look like, how Matlab would play in the environment, and the necessary support tasks involved in maintaining the environment. This is probably the least attractive alternative.
  2. Use .NET connections in Matlab since .NET provides connection pooling natively when using SQL Server. This sounds like a great option, but a .NET connection isn’t a JDBC connection and so I can’t use the already existing Matlab database infrastructure. I’ll have to rewrite all database operations like reading, updating, etc.
  3. Scrap Matlab altogether and switch to a (real) programming language like Java or C# (most likely the latter in my case). This will allow me to use connection pooling without having to do a bunch of custom Matlab database connectivity work. Of course, I’m not sure how this will compare in terms of execution speed for large array operations (which Matlab is famously good at).

Conclusion: for now I think I’ll leave things as they are. I have “bigger fish to fry” than the details of the simulator implementation. Besides, I’m not sure how much all of the work I put into alternatives 2 or 3 will save me in terms of run time. If I go from 3 hours to 2.5, the exercise seems pointless. If I can get from 3 hours to under an hour (which is what I think would happen), it may make sense. Either way, this’ll have to wait until I have nothing else to do and feel like rewriting all my infrastructure.

End: rant.