2009
03.25

Music for me and (some of) you

I started using last.fm about a week ago. So far, it’s done a good job of “discovering” music that’s similar to what I like. One artist that kept popping up was named Cargo Cult and the songs were damn good. So, I checked them out at Magnatune. I’ve shopped at Magnatune before and what I really like about Magnatune is the transparency. They tell you how you can and cannot use the music (most record companies simply tell you what you cannot do with it), they disclose how much they pay the artists for album sales, they let you pick how much you want to pay for the albums, they use formats unencumbered by DRM so you can listen to your music however you like… Magnatune is all kinds of goodness.

Another benefit of buying from Magnatune is they encourage limited sharing of the music by allowing you to give 3 copies of what you bought away to friends. So, I bought Cargo Cult’s album Alchemy and can give it to 3 others. Preview the songs at Magnatune and if you like what you hear, let me know and I’ll send you a link to download the album.

2009
03.25

New, new and new

New year (well…3 months into it), new code and a new design. Things were getting stale and I was not very happy with the version of WordPress I had installed…so I upgraded. WordPress still doesn’t do everything I want, however. I did find a couple plugins that aim to provide some of the functionality I want, namely a built-in wiki system. One was poorly documented and didn’t seem very well thought out. The other only allows you to create wiki pages on the admin-side of WordPress – not at all what I want. So, I’ll keep looking and testing.

2008
12.30

Space – for everyone!

SpaceX is cool. No, really. Using their Falcon 1e rocket, you can launch a 1010 kg payload into orbit for a mere $9.1 million. Makes me want to build a satellite just thinking about it.

2008
11.17

OpenSQLCamp 2008 impressions

I was able to attend OpenSQLCamp 2008, even while under the influence of DayQuil, jetlag (spent the week in Santa Clara), 3 hours of sleep and a 2 hour drive. Then there’s folks like Arjen who flew across 15 time zones and was still coherent – I guess I’m just not cut out for lots of travel in short time spans. Oh well.

The un-conference was great. You didn’t have the distractions of vendors with big obnoxious displays promising the sun/stars/moon, sales droids trying to peddle wares, marketing puppets who still think vendor lock-in is a great strategy – just some really smart folks getting together, sharing what they’ve learned and collaborating. People from MySQL, Drizzle, Postgres and SQLite were there.

Brian Aker’s keynote was insightful. I liked the sticker on his laptop: “My other computer is a data center.” Vadim’s session on the Percona patchset helped illustrate what patches they include in MySQL and why. A particularly good side-conversation in the session was Brian highlighting a need for someone to pick up management of the InnoDB code. Since InnoDB is developed behind closed doors, someone can step up and put up a mailing list so the community can coordinate 3rd party patches into a single, cohesive code base and provide for a community-driven timeline of features, fixes, etc. The Oracle/InnoBase developers would be most welcome; however, their lack of communication in the past has necessitated this.

Arjen’s OurDelta session explained what the OurDelta project is about and how it works behind the covers. For me, this highlighted how MySQL is becoming fractured due, in part, to MySQL’s/Sun’s development strategy. With such long timelines between releases, many patches containing fixes and features are put out there by the community and different people are bundling the patches and releasing new versions. OurDelta and Percona are the prime examples. They don’t always include the same patches in their releases. In all honesty, calling OurDelta or Percona’s distributions “MySQL” is misleading and leads to confusion. They are derivative works based on MySQL’s code – forks.

Jay Pipes’ Join-Fu talk was entertaining. Besides learning what really annoys Jay (the phrase “chaps my ass” was used a lot), I learned a bit about the internals of MySQL that I can directly apply. I did not know that stored procedures were compiled and cached on a per-connection basis, not across all connections. This can have serious performance implications.

Dr. Richard Hipp’s talk on How SQL Database Engines Work was good. It revealed the good and bad of retrieving data from indexes and data tables. After the talk, one is almost left wondering how in the world databases work at all given how extremely complex data retrieval can get. Yay smart people!

Peter Zaitsev filled in for Piotr Biel for the Sphinx talk. I had heard Patrick Galbraith discuss Sphinx during a MySQL Webinar on memcached UDFs a few weeks ago so I was just refreshing what I had already learned.

Kelly McDonald presented their work on developing a complete auditing system in Postgres for all database activity (or activity on the tables they were interested in.) Auditing systems can be annoying to develop so it was good to see how they did it. With a little C code and some triggers, they were able to identify all data changes down to the end user submitting the changes, including the web session – all vital for debugging problems in today’s complex web applications.

Finally, I sat in on the Lightning talks. Giuseppe Maxia gave a demonstration of the MySQL Sandbox. I’d heard about it but was impressed with the simplicity it affords. Ronald Bradford put out feelers for MySQL monitoring (what people use, what people should use, what should get monitored, etc.) Baron showed how to create snapshots using LVM for purposes of backing up MySQL. There were others but my head was quite fuzzy and sleepy with the onset of the Food Coma from lunch.

Hackathon. Sadly, I was feeling too crappy to drive back to Charlottesville to attend the hackathon on Sunday.

Thanks to Baron for dreaming this up and actually setting the wheels in motion. If you missed it, you missed out.

2008
10.22

Source code madness

I want to read code. I want to test patches. I want to contribute. With all the different source code managers out there, it’s a royal PITA. To illustrate, I need to know RCS, CVS, Subversion, git, Mercurial and Bazaar and that’s just the beginning. Many of the development tools I use only support CVS or Subversion – integrating other SCMs is possible but it usually takes a certain amount of duct tape. Many SCMs have web interfaces that allow you to browse the tree but it’s no substitute for a genuine SCM client.

Trying to remember all the different commands to pull code, generate diffs, push code, branch, rename, log notes, update, etc. across the different SCMs is enough to make you want to pull your hair out at times. What would be a God-send would be a “universal” SCM client that understands the different back-ends, while presenting a unified command set to the user. Such a client should hide all the SCM-specific crud but still give those who know a particular SCM a way to use its specific features when and where they need them.
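
To make the idea concrete, here is a rough sketch in Python of what the core of such a client might look like. Everything in it (the verb table, the function names, the choice of which native command each verb maps to) is my own illustration, not an existing tool. It only builds the command line; actually running it is left out.

```python
import os

# Marker directory that identifies each SCM's working copy.
MARKERS = {".git": "git", ".hg": "hg", ".svn": "svn", ".bzr": "bzr", "CVS": "cvs"}

# Hypothetical table: unified verb -> native command line per SCM.
COMMANDS = {
    "update": {
        "cvs": ["cvs", "update"],
        "svn": ["svn", "update"],
        "git": ["git", "pull"],
        "hg": ["hg", "pull", "-u"],
        "bzr": ["bzr", "pull"],
    },
    "diff": {scm: [scm, "diff"] for scm in ("cvs", "svn", "git", "hg", "bzr")},
    "log": {scm: [scm, "log"] for scm in ("cvs", "svn", "git", "hg", "bzr")},
}


def detect(path):
    """Guess the SCM from the marker directory found in `path`, or None."""
    for marker, scm in MARKERS.items():
        if os.path.isdir(os.path.join(path, marker)):
            return scm
    return None


def translate(scm, verb, *extra):
    """Turn a unified verb into the native command line for `scm`."""
    try:
        base = COMMANDS[verb][scm]
    except KeyError:
        raise ValueError(f"no mapping for {verb!r} on {scm!r}")
    return base + list(extra)


def passthrough(scm, *args):
    """Escape hatch: hand SCM-specific arguments to the native client untouched."""
    return [scm] + list(args)
```

With this, translate("hg", "update") yields the Mercurial command line and translate("git", "update") the git one, while passthrough("git", "rebase", "-i") lets someone who knows git reach a feature the unified verbs don't cover.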

2008
10.09

Truth in Broadcasting

I was watching a college football game a weekend or two ago when one of the announcers made probably the most revealing statement I’ve ever heard. He said “[university] has a hard time getting good players because their academic standards are so high.” This can only be punctuated with two stories, told to me by the people who were there.

Story 1

At a University computer testing center, a proctor asked a student, who came in to take a computerized test, for their ID. It’s standard practice. The student pulled out the University magazine, pointed at the cover showing the star basketball center going in for a lay-up and said “that’s me.”

Story 2

An assistant professor was about to hand out the final test for the semester when he noticed someone he’d never seen in class before. “Excuse me, but who are you?” he asked. The student looked up and replied “I’m the nose tackle for the football team,” expecting that his response would be sufficient for anyone around campus. After a few minutes of back-and-forth conversation, the professor handed him a test. Needless to say, the professor was pressured to pass the student, even though he had never attended class and was given special permission to write a paper to “make up” for any missed class work – a paper the professor helped the student to write.

These stories sound made up but they were told to me by the proctor and the assistant professor, not by a second-hand witness. When I heard the 2nd story, I felt cheated. In my view, the value of the education I was sweating and tearing my hair out to pay for (to the tune of over $10,000 a year) dropped considerably.

2008
10.07

Presentation results

My presentation, entitled “MySQL for non-DBAs” went well…considering the very broad topic and that the laptop I was going to present from decided to have a hissy fit. It was my work-sanctioned XP Pro laptop and the registry decided to corrupt itself, causing it to reboot over and over and over and…. Oh joy! Luckily, the group organizer allowed me to borrow his Mac and since I had the presentation on a USB drive (OpenOffice.org format, of course) we were rockin’ and rollin’ in no time.

I took the “MySQL 5.0 for DBAs” course last year. That was a 5-day course that used 220+ slides. I compressed over three quarters of that content down to about 40 slides for a 2-hour talk with some live demo work thrown in for good measure. It was a tough presentation to put together given such a large amount of information. I tried to highlight the design, the engines, the commands, etc. I relied on audience questions to act as tangents to go a bit deeper into specific areas. Overall, I think it went well.

My presentation is available as PDF and OpenOffice.org and licensed under Creative Commons Attribution-Share Alike 3.0. Additionally, the presentation was recorded but I haven’t seen it posted on the NoVaLUG site yet.

2008
09.25

Presenting at NoVaLUG

I volunteered to give a presentation on MySQL at the next NoVaLUG (http://www.novalug.com) meeting, Oct 4th. Topics to include: installation, configuration, administration, replication, backup/restore, architecture, monitoring and troubleshooting. I won’t be deep-diving on any particular topic (too much for the time slot) but, instead, focusing on the basics.

Should be a fun time.

2008
08.15

Two things this week

Ok…Apache needs to do a better job at documenting how to build the software and what the defaults are. I’ve been building Apache for years, including the painful days of Apache 1.3 when you had to patch the module into the Apache source tree (I don’t want to hear about your “back in Apache 0.99 we had to hand edit the core.c to add modules, uphill, both ways” pains.) So, yesterday, I was setting up a server where I want to use Apache and the mod_proxy modules to front two JBoss servers for some experimentation. I downloaded the latest tarball, unpacked it and did a “./configure --prefix=/usr/local/work --enable-mods-shared=all” – foolishly thinking that all modules would get built as shared DSO objects. I mean, doesn’t “all” mean “everything?” I found that I was wrong.

After doing a “make && make install” I went and checked the fruits of my labor. Examining the contents of the modules directory revealed <drum roll> no mod_proxy modules. I didn’t even see mod_ssl. So I checked the config.log and saw that mod_proxy and mod_ssl were disabled. Shock. Surprise. Much swearing. Apparently, “all” does not mean “all” but rather “some.” I went back and added “--enable-proxy=shared --enable-ssl=shared” to my ./configure command line and rebuilt.

If you’ve used Linux/Unix long enough, you’ve come across the mknod command. It’s for creating special files and device files. One very useful special file is called a pipe, which is a simple queue. A process can write to the pipe while another can read whatever is written to it. This allows for multiple processes to share data using simple file handling code instead of more complex networking code.
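
For illustration, here is a minimal Python sketch of that “simple file handling code.” It uses os.mkfifo, which creates the same kind of special file you would get from “mknod demo.fifo p”. The function name and file name are made up, and to keep it self-contained I use two threads in one process as stand-ins for the two processes. It needs a Unix-like system.

```python
import os
import tempfile
import threading


def fifo_roundtrip(messages):
    """Send messages through a named pipe and return what the reader saw."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.fifo")
    os.mkfifo(path)  # creates the pipe special file on disk

    def writer():
        # Opening for writing blocks until a reader opens the other end.
        with open(path, "w") as pipe:
            for msg in messages:
                pipe.write(msg + "\n")

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        # Plain file reading; iteration ends when the writer closes the pipe.
        with open(path) as pipe:
            received = [line.rstrip("\n") for line in pipe]
    finally:
        thread.join()
        os.remove(path)
        os.rmdir(workdir)
    return received
```

Calling fifo_roundtrip(["one", "two", "three"]) hands back the messages in the order they were written, which is the queue behavior described above.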

Now, a pipe is a First In, First Out structure. There exists another, similar mechanism called a stack. This is a First In, Last Out structure – also called a Last In, First Out structure. It’s very handy. What would be even handier would be the ability to create a “stack special file” like you can a pipe using mknod.
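
The difference between the two is easy to show in memory. This is just a small Python sketch of the two orderings with made-up function names; it is not a “stack special file,” which as far as I know mknod cannot create.

```python
from collections import deque


def drain_fifo(items):
    """Pipe-like behavior: items come out in the order they went in."""
    queue = deque(items)
    out = []
    while queue:
        out.append(queue.popleft())  # take from the front
    return out


def drain_lifo(items):
    """Stack behavior: the most recently added item comes out first."""
    stack = list(items)
    out = []
    while stack:
        out.append(stack.pop())  # take from the end
    return out
```

Feed both the same three items and drain_fifo returns them in their original order while drain_lifo returns them reversed.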

2008
07.28

ElasticDrive – yay or nay

Had one of those “interesting” emails in my inbox this morning. It was in regards to ElasticDrive, a product that allows you to use Amazon’s S3 service to provide a backing store for a remote file system. Now, remote file systems are nothing new, they’ve been around for decades. What makes ElasticDrive interesting is it makes the remote storage appear as a block device, allowing you to format it with the file system of your choice or even use it in a software RAID configuration.

Now, iSCSI is a protocol that presents remote storage as a block device. It needs to be said that iSCSI was designed to do this. ElasticDrive works over HTTP to connect to a virtually unlimited amount of storage – just add more S3 buckets and you’ve increased your storage. HTTP was *not* designed to do this. ElasticDrive uses 64 KB buffers behind the file system. So, if you use the ext3 file system, with its default 4 KB buffers, ElasticDrive will not commit the data to S3 until the 64 KB buffer is full. So while your system is happily throwing 4 KB buffer chunks at it, you’re skating on thin ice.
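
Here is a toy model in Python of the buffering as I understand it from the description. It is not ElasticDrive’s code, and the class and attribute names are invented; it only shows how much data sits uncommitted when 4 KB writes feed a 64 KB buffer that flushes only when full.

```python
class WriteBackBuffer:
    """Toy model: hold writes locally, commit only when the buffer is full."""

    def __init__(self, capacity=64 * 1024):
        self.capacity = capacity
        self.pending = bytearray()  # data not yet sent to remote storage
        self.committed = []         # stands in for uploads to remote storage

    def write(self, block):
        self.pending.extend(block)
        # Commit in whole-buffer units; any remainder stays pending.
        while len(self.pending) >= self.capacity:
            self.committed.append(bytes(self.pending[:self.capacity]))
            del self.pending[:self.capacity]

    def at_risk(self):
        """Bytes that would be lost if the machine died right now."""
        return len(self.pending)


buf = WriteBackBuffer()
for _ in range(15):
    buf.write(b"\0" * 4096)      # fifteen 4 KB writes from the file system
uncommitted = buf.at_risk()      # 15 * 4096 bytes still only local
buf.write(b"\0" * 4096)          # the sixteenth write fills the buffer
```

In this model it takes sixteen 4 KB writes before anything is committed, so up to 60 KB can be sitting in the buffer at any moment.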

Now, granted, I haven’t downloaded or used ElasticDrive, but I don’t have a whole lot of confidence in it, especially when Amazon’s S3 goes down for 8 hours. Also, Amazon has no guarantees on the performance of its S3 service. And since all traffic is going over the Internet, which has no performance guarantees, I wouldn’t want to put any real data on it – data critical to my business.

I could be wrong. ElasticDrive may be a great product. It may function excellently and without error. Unfortunately, it isn’t going to be a cheap option. The software itself costs $1 per gig of space. That’s on top of what Amazon is going to charge you. They charge per gig of storage, per gig of data transfer in and out of their S3 network and they charge for the HTTP requests to read and write data. All that can add up real quick.