Friday, December 27, 2013

Streaming binary file from http response up to AWS S3 with Node.js

Recently, I found myself needing to take images from a remote source and copy them to an Amazon S3 bucket to be used by a client website. Initially, the task seemed simple enough... for each image:

  1. Send GET request to remote source for image
  2. Write response onto webserver as file
  3. Read file and upload to S3
Although each image would simply overwrite the previous one on the webserver, there really isn't a need for any additional disk writing to the webserver, since the end goal is only to copy the images up to S3:
  1. Send GET request to remote source for image
  2. Forward response to S3
The concept is fairly trivial, but it still took me a good amount of time to nail down due to a quirk I wasn't aware of in the Node.js request module. The main modules involved are request, for fetching the image, and an S3 client, for the upload.
The biggest gotcha here was that the request module decodes the response body into a String by default. To make sure the response is kept as raw bytes, the request should be configured with null as the encoding, which makes the body a Buffer. Related code:
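Something along these lines is a minimal sketch of the idea (not necessarily the exact code I ended up with); it assumes the aws-sdk module on the S3 side, and the URL, bucket and key are just placeholders:

// Minimal sketch: fetch an image with the request module and forward the
// raw bytes to S3 using the aws-sdk module (assumed here; credentials are
// picked up from the environment). URL, bucket and key are placeholders.
var request = require('request');
var AWS = require('aws-sdk');

var s3 = new AWS.S3();

request({ url: 'http://example.com/images/photo.jpg', encoding: null }, function (err, res, body) {
  if (err || res.statusCode !== 200) {
    return console.error('Download failed', err || res.statusCode);
  }
  // "encoding: null" keeps body as a Buffer instead of a String,
  // so the image bytes are not mangled before the upload.
  s3.putObject({
    Bucket: 'my-bucket',
    Key: 'images/photo.jpg',
    Body: body,
    ContentType: res.headers['content-type']
  }, function (uploadErr) {
    if (uploadErr) {
      return console.error('Upload failed', uploadErr);
    }
    console.log('Image copied to S3');
  });
});

Note that the whole image is buffered in memory before the upload, which is exactly why large files are the open question mentioned below.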
I've only tried this so far with images, but it should work with just about any small file. Unfortunately, I am uncertain about how well this works with a large file, where memory issues may arise. Obviously, the code will need to be tweaked to fit your needs.

Friday, October 26, 2012

INTELLIsense? I think not...

I am ashamed. These blog posts are getting few and far between and certainly shorter and shorter. However, here's a short and sweet tip for those of us who work on occasion with MS SQL Server.

IntelliSense is often useful for getting at those pesky tables with long names -- if not for the auto-complete function, at least verifying that certain tables or columns exist certainly helps. However, this is predicated on the IntelliSense cache not being... cachey. For example, a table structure is changed and, for one reason or another, IntelliSense refuses to acknowledge the existence of, let's say, a new column. What can we do? Refresh the cache, of course!

 There are two ways to refresh the cache:
  1. Go to Edit -> IntelliSense -> Refresh Local Cache, or
  2. Hit Ctrl+Shift+R
Maybe I'll start doing some basic posts about jQuery, CSS and basic DOM navigation in the future. But, until next time...

Source: Dan's Blog

Thursday, June 14, 2012

When there is no time for Sametime

Short entry here but possibly useful for anyone who needs to use Lotus Notes but never uses any of the Sametime features. As someone who fits the above criteria, I found it to be quite the irritation that, despite my efforts to disable Sametime, the tray icon nevertheless insisted on making an appearance. As it turns out, to remove the icon from the tray while your Notes client is launched, you have to edit a *.ini file. Here are the instructions:
1. Edit the plugin_customization.ini file in Notes, typically at c:\Program Files\lotus\notes\framework\rcp\plugin_customization.ini or c:\notes\framework\rcp\plugin_customization.ini

2. Add the following line: com.ibm.collaboration.realtime.application/useSystemTray=false

Once this is done, close Notes and launch it again. This should work.
I actually found my plugin_customization.ini file at c:\Program Files (x86)\IBM\Notes\framework\rcp\ but it's expected that there's some variation between systems (naturally). Anyway, following these steps worked like a charm for me, so it should for you too, making your system tray just a little bit cleaner.

Source

Wednesday, December 14, 2011

Hale Aloha CLI: Part 2: Revenge of Hash: The second coming v2.0

Previously, on "Trials and Tribulations" - Command-line interface hale-aloha-cli-hash was tapped for inspection (episode recap). Critics (me) of Team hash's command-line phenom raved about the tool, saying that it "does indeed accomplish a useful task" and that the team "did an admirable job". But the tool was not only met with gold stars and sunshine as the very same critics (still me) mercilessly pointed out that the application was "not without its flaws".

This episode is brought to you with no further commercial interruption by yours truly.

Strangely enough, although we still refer to Team hash as the trio of developers who were responsible for the genesis of hale-aloha-cli-hash, a twist of fate intervened and alas, the project that I had extensively reviewed previously suddenly landed within my domain of responsibility. Due to the nature of the Hale-Aloha-CLI project, the rationale behind Team chair suddenly taking over the hash project was sound. Often, a piece of software is written with a certain approach, design and frame of mind. These concepts, however, often belong exclusively to the originators of the software. In open-source development, it is important to keep new developers in mind, and since the objective behind the Hale-Aloha-CLI project was to create a user interface which was modular enough for new developers to add additional commands, it made good sense for our two teams to switch our development focus to each others' projects.

Although I had performed a fairly in-depth review of hale-aloha-cli-hash in my previous entry, it became obvious that a hands-on experience was going to reveal so much more as soon as I began implementing a new command. Before I go any further, here is the project page for hale-aloha-cli-hash and you'll note that the system now accepts 3 new commands: set-baseline, monitor-power and monitor-goal.

All in all, the design of the system was pretty intuitive: all the code for executing commands was together, all the code for argument checking was found together, and so forth. The biggest challenge, initially, was fully understanding the thought process of the previous developers. There was a Java interface in the project, which I initially heralded as being uniquely dissimilar to the hale-aloha-cli-chair implementation. Unfortunately, upon further investigation, I found that the developers may have misinterpreted the reasoning for the interface to exist in the first place.

For some of you out there, it may behoove you to know that an interface acts sort of like a blueprint. It lays out the basic framework for building something, in this case a Command. After adhering to the basic framework, it then becomes the developer's prerogative to embellish and add functionality to the Command to make it unique and useful, as long as the basic framework remains the same. An interface, in Java terms, defines methods that have to exist in every class which implements it. However, in hale-aloha-cli-hash, the interface defines every version of existing commands, and the only class which implements the interface is responsible for executing the different commands. What this means is that the interface becomes a blueprint that is used only once, which, in this case, defeats the purpose of drawing one out in the first place. Secondly, due to this fact, every time a new command is added to the interface, it also has to be added to the execution class. It quickly became a little unwieldy to keep track of all the changes that had to be made to existing code in order to add a single command.
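To make the distinction concrete, here is a purely illustrative sketch (the class names are made up, not taken from either project) of the blueprint approach, where each command is its own class implementing a shared interface:

// Illustrative sketch only -- not the actual hale-aloha-cli source.
// The Command interface is the "blueprint"; each command is its own class.
public interface Command {
  String getName();               // e.g. "current-power"
  void execute(String[] args);    // run the command with its arguments
}

class CurrentPowerCommand implements Command {
  public String getName() {
    return "current-power";
  }
  public void execute(String[] args) {
    // look up the tower or lounge named in args[0] and print its current power
  }
}

Under this arrangement, adding a new command means adding a new class; nothing that already exists needs to be edited.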

However, the real challenge was again in working with group members. It is certainly understandable that most members, if not all of us, had other big responsibilities looming on the horizon, but it wasn't until the project deadline started to peek over the mountains and cast its oppressive rays upon the group that our lines of communication really fired up. I tried to pick up one of the more unique commands, monitor-power, as my primary issue for version 2.0 of hale-aloha-cli-hash because it involved a bit of thread use (which I was interested in) and it served as a takeoff point for the monitor-goal command. What I found out soon after, however, was that monitor-goal was even more dependent on the existence, or at least an abstract conceptual mock-up, of the set-baseline command. Due to the division of labor that we had settled on as a group, this meant that if one command was delayed, it had the potential to delay another command from being made as well.

This was especially evident as the set-baseline command was not fully incorporated into the system until recently, and thus the monitor-goal command had been stalled in its production a bit. In the umpteenth hour, however, my team and I were able to roll out a working product. Given the fact that we were building our system on top of a previous developer's system, I thought that we did a tremendous job overall of adapting our mindsets to accommodate the different design. One of the few downsides to this, I found, was that I caught myself essentially assimilating the style of the previous group and digressing from my own coding style. This is not necessarily a bad thing, but being a chameleon when dealing with poorly written code can prove to be a problem in the future.

I do not mean to boast here but to simply make an observation based on the facts as I perceive them. The previous developers of hale-aloha-cli-hash created a system that did satisfy the three Prime Directives of software engineering, but I strongly believe that hale-aloha-cli-chair's design was more intuitive for developers. Of course, this is not to say that hale-aloha-cli-hash no longer meets the third Prime Directive, but that we may have taken the third directive more to heart in our inception of hale-aloha-cli-chair. If nothing else, this should be a learning experience for everyone in that while it is important to have a functional product, you can never spend enough time mulling over and designing the architecture of your system from the ground up.

In the end, I am still a huge fan of Issue Driven Project Management because I believe that, under the right circumstances, it can complement and expedite the development process tremendously. Despite some of the communication issues and some last-ditch efforts to get things working, IDPM still helped our collaborative efforts along as a group. The experience of taking over hale-aloha-cli-hash was one of the most unique experiences in my software engineering career, and it was invaluable in the lessons it taught me. Although the project is not without its bugs and flaws (some existing from the previous development team, while many others were introduced by our current team), I feel pretty good about the project experience overall and I look forward to a fruitful future of similar challenges!

As a final cautionary note, this may be the season finale of "Trials and Tribulations" but rest assured that I am fighting for the networks to renew it for many more to come!

Thursday, December 1, 2011

Hale Aloha CLI Spotlight: Hash

My previous blog was clouded by frustration, and I admit that I left out crucial components. For example, for an entry about my experiences with Issue Driven Project Management, I all but neglected to touch upon what IDPM entailed at all. Even more to my shame, I hadn't mentioned that our project was called hale-aloha-cli-chair even once. I promise to try and make up for my previous deficiencies here while attempting to be concise enough to get to the point of this entry: a technical review of our sister (brother?) project, hale-aloha-cli-hash (You'll note, of course, the similarities in the project names).

Both the hale-aloha-cli-chair and hale-aloha-cli-hash projects set out to accomplish the same goal: a Java based implementation of a Command Line Interface which provides its users a functional way of interacting with WattDepot energy data from the Hale Aloha student residence towers on the campus of the University of Hawaii at Manoa. In this implementation, both our teams set out to design an open source system which has six commands built-in: help, quit, current-power, daily-energy, energy-since and rank-towers. For a more in depth description of the commands' functionality, I present to you the project page for hale-aloha-cli-chair.

If you'll recall from my first blog entry, the all-encompassing goal of open source software boils down to a useful application for users and a higher level of effectiveness in allowing developers from anywhere to contribute with relative ease. Thus, I reiterate here once again, for your convenience, the three Prime Directives of open source software engineering:

Prime Directive #1: The system successfully accomplishes a useful task.
Prime Directive #2: An external user can successfully install and use the system.
Prime Directive #3: An external developer can successfully understand and enhance the system.

However, whereas the previous review of the open source project aTune was a mere glance at a well-established application, I am now in a unique position where I am able to test and review an early version of hale-aloha-cli-hash, with access to all of the quality assurance tools the hash team was using as well as general insight into their goals as I analyze their source code and architecture in depth.

I would be remiss, however, if I didn't do my due diligence and lay down a better picture of the general approach that both the chair and hash teams took towards the Command Line Interface implementation. To be brief, the project consists of energy usage by students in the Hale Aloha towers being recorded by sensors which communicate their data to a WattDepot database, which in turn allows the Command Line Interfaces to retrieve relevant energy data to display to the user.

OK, you caught me, those are not WattDepot servers, but look how happy that user is with the CLI!

To learn a bit more about how WattDepot can be interacted with, see my previous blog entry regarding my initial experience with WattDepot. From a user's perspective, this is more than sufficient information to begin using the CLI. However, from a developer's standpoint, there are a few more objectives in the CLI implementation. First of all, both of these CLI projects are managed by an Issue Driven Project Management system and are under Continuous Integration.

For your experienced developer, these concepts may not be a shock at all, but for those of you who are new to software development or are simply interested in educating yourselves, both of these processes are tools which aid a development team in achieving efficient collaboration. Issue Driven Project Management, or IDPM, is a means of managing a project by creating simple, uncomplicated issues which can be resolved in a short span of time. What this allows a project team to do is to create a large quantity of small tasks for its team members to draw upon so that they will always have something to work on at their pace. Since the issues are quickly resolved, IDPM also requires that the team members put their heads together frequently to generate more issues so that their resources (read: developers) are never idle. In supplementing IDPM, there are a number of automated quality assurance tools that are required for successful development of the CLIs as well. These assurance tools allow for a concept called continuous integration.

When multiple developers are working on a project concurrently, there is bound to be an integration issue. Even though developers may be working on separate issues, they may end up modifying something similar and conflict with one another's work. What continuous integration does is not to prevent these conflicts from happening but to make certain that the conflicts can be quickly resolved. If developers are committing their changes to the shared repository with a higher frequency, the amount of modification to overcome when a conflict occurs is going to be smaller. This means that in order for continuous integration to work, changes made by a developer must automatically be checked for errors by quality assurance tools and each developer must make small but frequent commits to the repository.

Armed now with this barrage of conceptual knowledge, I invite you to cruise alongside me as I walk through my technical review of hale-aloha-cli-chair's sister... brother... sibling project: hale-aloha-cli-hash. Keep in mind, once again, that the big picture is to make sure that the hash team was able to satisfy the three Prime Directives of open source software engineering.

Prime Directive #1: The system successfully accomplishes a useful task.

To answer this question, I first went to the hale-aloha-cli-hash project page where I was able to download the latest distribution of the project.

I first ran through the system using correct input values:

E:\Downloads>java -jar hale-aloha-cli-hash.jar
Connected successfully to: http://server.wattdepot.org:8190/wattdepot/

>
help
Here are the available commands for this system.
current-power [tower | lounge]
Returns the current power in kW for the associated tower or lounge.
daily-energy [tower | lounge] [date]
Returns the energy in kWh used by the tower or lounge for the specified date (yyyy-mm-dd).
energy-since [tower | lounge] [date]
Returns the energy used since the date (yyyy-mm-dd) to now.
rank-towers [start] [end]
Returns a list in sorted order from least to most energy consumed between the [start] and [end] date (yyyy-mm-dd)
quit
Terminates execution
Note: towers are: Mokihana, Ilima, Lehua, Lokelani
Lounges are the tower names followed by a "-" followed by one of A, B, C, D, E.
For example, Mokihana-A.
>
current-power Ilima
Ilima's power as of 2011-12-01 was 27.439735792375984 kW
>
daily-energy Ilima 2011-11-30
Ilima's energy consumption for 2011-11-30 was: 664.6056753929341 kWh
>
energy-since Ilima 2011-11-25
Total energy consumed by Ilima from 2011-11-25 to 2011-12-01 is: 3951.904623169437 kWh
>
rank-towers 2011-11-25 2011-11-30
Enter dates as YYYY-MM-DD
Older data may not be valid.
Command was not processed, please check for correct arguments
>
rank-towers 2011-11-25 2011-11-30
For the interval 2011-11-25, 2011-11-30 energy consumption by tower was:
Mokihana-06-telco 126 kWh
Lokelani-08-telco 131 kWh
... (truncated for the blog entry's sake)
Ilima 3601 kWh
Lokelani 4172 kWh
>
quit

Of the six commands the CLI is supposed to have implemented:
> help: provided a helpful message explaining the usage.
> current-power: implemented and functional.
> daily-energy: implemented and functional.
> energy-since: implemented and functional.
> rank-towers: implemented, but gave errors at times even with correct input. Also ranks towers AND lounges instead of just towers, but this is more of a design choice than a problem.
> quit: exited the system cleanly.

All of the intended functionality of the project was implemented and working. Although the rank-towers command did not always execute properly, the system is still useful overall and it is fair to say that, although it is not without bugs/flaws, hale-aloha-cli-hash does indeed accomplish a useful task.

Prime Directive #2: An external user can successfully install and use the system.

In order for the project to allow an external user to successfully install and use the system, there ought to be some features about both the project itself and the project homepage which provide guidance for the average user who may not know much about software development. There were several questions I asked myself in order to determine if the hash team was successful in conveying this user-friendliness.

> The project homepage provides an overview of the purpose of the application and briefly lists the available commands.
> There is a User Guide wiki page featured on the project homepage which describes how to download, install and execute the system. The wiki also lists the available commands again, and a more detailed description of each command.
> The downloadable distribution includes an executable jar (the execution of which was seen up above) which means the user doesn't have to know how to compile and build the system in order to use it.
> The system version number is clearly attached to the project distribution and users can easily use this to keep track of which evolution of the system they are using.

In addition to the valid inputs above, I also tested the system with invalid inputs.

E:\Downloads>java -jar hale-aloha-cli-hash.jar
Connected successfully to: http://server.wattdepot.org:8190/wattdepot/

>
hello world
Invalid Command
>
HELP
Invalid Command
>
current-power
Invalid Number of Inputs
>
current-power foo
Command was not processed, please check for correct arguments
>
current-power 1 2
Invalid Number of Inputs
>
daily-energy foo bar
Error occured while running daily-energy.
Command was not processed, please check for correct arguments
>
daily-energy Ilima foo
Error occured while running daily-energy.
Command was not processed, please check for correct arguments
>
daily-energy Ilima 11-25-2011
Error occured while running daily-energy.
Command was not processed, please check for correct arguments
>
daily-energy Ilima 2012-11-25
Error occured while running daily-energy.
Command was not processed, please check for correct arguments
>
daily-energy Ilima 2011/11/25
Ilima's energy consumption for 2011/11/25 was: 548.7930279141283 kWh
>
current-power foo
Command was not processed, please check for correct arguments
>
rank-towers 2011-11-25 2011-13-25
Enter dates as YYYY-MM-DD
Older data may not be valid.
Command was not processed, please check for correct arguments
>
exit
Invalid Command
>
quit

Judging from the reasoning behind the invalid inputs, it was clear that, in general, invalid inputs are broken up into three categories: unrecognized commands, an incorrect number of arguments, and everything else, which simply results in an error along with a helpful hint of what the user may have done wrong to elicit it. Surprisingly, the CLI accepted my input of 2011/11/25, which is not necessarily a bad thing. The only reason this raises an eyebrow is because the hash team simply chose to disregard the delimiters, so an input such as 2011111225 (perhaps due to an inexperienced typist) will still be accepted and there may be some combination here which gives the incorrect date. However, here is the rundown of invalid inputs I tried:

> Completely invalid command was recognized as an invalid command.
> Upper case version of an available command was recognized as an invalid command.
> Invalid number of arguments (too many or too few) handled well and lets users know of the problem.
> Invalid arguments handled well, but only a general message is provided as to why an error occurred (this is, again, more of a design choice than a problem).
> Strange delimiters for dates will be accepted as long as they are one character long.

Given that hale-aloha-cli-hash provided users a simple documentation of how to get up and running and continued to guide them throughout the usage of the CLI, I would say that the hash team did an admirable job of satisfying the second Prime Directive. As an external user of this application, I believe I would be capable of following the User Guide wiki and understand what the CLI is capable of from both the documentation and the built-in guidance that the application provides.


Prime Directive #3: An external developer can successfully understand and enhance the system.

Here is where the distinction between open source software and other software really shows up. This CLI implementation, as well as our own, needs to be not only useful to and usable by users, but also expandable and modifiable by other developers. The goal of this project is to create a CLI with four built-in data commands for interacting with WattDepot energy data. However, it's also very important to note that, since WattDepot has so much other data available, it would be prudent to allow the CLI to be extensible with more commands. In determining whether the CLI created by the hash team does indeed allow for an external developer to easily enhance the system, I asked myself several questions to this end as well.

First, I investigated the project's homepage for some insight into the project and for hints on how to get started.

> A Developer's Guide wiki page was featured on the front page.
> The guide provided clear instructions on how to build the system from source files.
> The guide indicated the quality assurance standards being followed by the project.
> The guide provided some tips on how a developer might try to adhere to those quality assurance standards.
> The guide provided references to coding standards being followed.
> The guide mentions that the project is issue driven and explains how to partake in the project through creation of issues.
> The system is under Continuous Integration and the guide provides a link to the CI server associated with the project.
> The guide also explains how JavaDoc documentation can be easily generated.

Since the wiki page was very well done and provided a good insight into the development process, I felt confident with taking the next steps towards contributing to the project. I was able to check out the sources from SVN (recall my previous entry explaining the advantages of SVN) and generate the JavaDoc documentation for my perusal very easily. Regarding the JavaDocs themselves, I gained even further insight into the project.

> From viewing just the JavaDocs, I was able to gain a decent understanding of the system's architecture and its components.
> For the most part, the naming scheme of the components was very intuitive and clearly indicated their underlying purpose, with the exception of the existence of both a Node and a Nod class. I was able to ascertain what a Node class probably does, but it seemed like the Nod class performed roughly the same duties. The name of the class being "Nod" was also a bit odd, but this was just a small problem that I could certainly live with.
> The system is designed to support information hiding and uses the protected keyword to restrict access to certain methods to only those within the package. This is likely done to facilitate JUnit testing, but it also offers other advantages like extensibility (a small sketch of the idea follows this list).
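Here is the small sketch promised above, with hypothetical class names rather than anything from the hash code base: a helper is marked protected, so a JUnit test living in the same package can exercise it directly, while code outside the package cannot.

// DateChecker.java -- hypothetical example, not hash's actual code
public class DateChecker {
  // accepts yyyy?mm?dd with any single-character delimiter
  protected boolean isValidDate(String date) {
    return date.matches("\\d{4}.\\d{2}.\\d{2}");
  }
}

// TestDateChecker.java -- lives in the same package as DateChecker
import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class TestDateChecker {
  @Test
  public void acceptsDashedDates() {
    // the protected helper is reachable here because the test shares the package
    assertTrue(new DateChecker().isValidDate("2011-11-25"));
  }
}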

After generating the JavaDocs, I was then able to build the system and generate test coverage information regarding the system using JaCoCo. Using this information and a review of the testing source code, I was able to determine that the existing set of test cases was implemented rationally and does a good job of preventing new developers from making enhancements that can break the pre-existing code. In this analysis process, I was also able to gain further insight into the hash team's reasoning behind certain design choices. In addition to reviewing the test case source code, I reviewed the actual implementation source code as well. To this end, I found several more good qualities about the hash team's code.

> Java coding standards were followed.
> Comments were sprinkled throughout the code appropriately, giving new developers clues and explanations regarding existing code.
> The code is easy to understand and the comments were neither too sparse nor too overwhelming.

Since the project was under continuous integration, I also checked the CI server associated with the project to draw some conclusions about the consistency of the hash team's development process.

> Although there were build failures, indicating that a developer may have failed to verify the build prior to committing to the repository, the failures were always corrected promptly.
> The system was worked on in a consistent fashion, and commits from developers were done very frequently.
> Most of the commits made to the repository were associated with an appropriate issue, with the exception of a few quick fixes here and there.

Finally, although this does not really have huge relevance (it is relevant, just not hugely so) to the third Prime Directive, I was curious to see how the system was contributed to by its three individual developers. To this end, I reviewed the team's project page and analyzed the associated Issues page.

> It was evident which parts of the system were worked on by each developer.
> Based on the Issues page, it is apparent who I would need to talk to if I had a question regarding a certain part of the system or its behavior.
> Even with just a simple glance at the Issues page, it becomes glaringly obvious that one particular developer contributed significantly less than the other two. Whereas developers BrysonYHori and mitchell.kupfer seemed to have implemented the majority, if not all of the system, developer macmillan.johnw contributed only two fixes to the project which, based on the Issues, were relatively minor and done in a single day's time frame.

Although hale-aloha-cli-hash is not without its flaws here, it is clear that a strong effort was made from the inception of this project to create a modular system which is open to external developers. From the project's homepage to the source code documentation and even the team's own striving for excellence in adhering to their coding standards, it is clear that this project is one that satisfies the third Prime Directive of open source software engineering. It is a shame that of the three developers on the team, there was an evident imbalance in contribution and effort but, once again, that does not factor into my analysis of the project's adherence to its goals at all.


Hale Aloha CLI CHAIR Project Page
Hale Aloha CLI hash Project Page

Tuesday, November 29, 2011

Trials and Tribulations of Computers and Sciences

I'll be frank. This post is not going to be pretty, elegant or overall very fun to read. This is not your fault, readers. I am not punishing you. In fact, this particular entry is being written as a response to a software development project that I was recently, and am currently, involved in. Unfortunately, until 7 hours ago, the project was, more or less, not fully implemented and thus there was a bit of difficulty in gathering my final thoughts on the ultimate experience I wanted to convey.

To be brief, the project's inconsistency most likely stemmed from a difficulty in facilitating an effective line of communication between our team members, likely due to the fact that we are all current college students with too much on our plates and too little time in the day. However, this ultimately resulted in quite a bit of "last minute heroics". I have to admit, I am largely to blame. At the inception of the project, I was full of optimism and may have unwittingly assumed a role of leadership I was not necessarily willing to be in. As a result, as my initial barrage of Emails missed their marks, I let my own failure to "rile up the troops", so to speak, get to me. As a result, my own enthusiasm for the project diminished significantly and now I sit before you in confession of having given up even before the "last minute heroics".

Overall, my experience of issue driven project management was hectic. In theory, it sounds like an amazingly streamlined and intuitive way of solving the project management problem. However, in practice, if project members are unable to communicate effectively, the entire process completely deteriorates. This is, of course, not the problem of the design of the concept but rather a problem with the people trying to take advantage of the process. At times, it seemed like a big hassle to create an issue just to work on something but, again, this problem arises only due to a lack of consistent group meetings.

The command-line interface our group has implemented is fully functional and includes four commands for the users to interact with the WattDepot energy data at the Hale Aloha towers on the UH campus. Our design allows for the project's main execution to only create a processor, which then essentially takes over. In retrospect, having the main method exist outside of the processor is probably unnecessary, but in the spirit of keeping things modular, this was the design choice we made. The processor then explicitly "gathers" commands in a HashMap. Although I say "gathers" here, our implementation actually requires that we hard-code the existence of these commands into the processor. However, aside from adding a couple of lines to the class, the system is modular and adaptable enough to accept additional commands.
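As a rough sketch of what that looks like (hypothetical names, and a stripped-down stand-in for our actual Command type, not the project's real source):

import java.util.HashMap;
import java.util.Map;

public class Processor {
  // hypothetical command type; the real project's interface differs
  interface Command { void execute(String[] args); }

  private final Map<String, Command> commands = new HashMap<String, Command>();

  public Processor() {
    // commands are hard-coded into the map; adding one means editing this class
    commands.put("current-power", new Command() {
      public void execute(String[] args) { /* query WattDepot and print the result */ }
    });
    commands.put("quit", new Command() {
      public void execute(String[] args) { System.exit(0); }
    });
  }

  public void process(String name, String[] args) {
    Command command = commands.get(name);
    if (command == null) {
      System.out.println("Invalid command.");
    } else {
      command.execute(args);
    }
  }
}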

Although I feel like the overall quality of the project is pretty good and is something I can reflect on and take some pride in, I am certain that, in the past 7 hours (6 of which I was asleep for), there was a significant amount of testing done to the completed project as a whole. Additionally, I found it really troublesome to write JUnit test cases for many of the classes I implemented, due to functionality conflicts like void methods or the infinite loop which drives the command-line interface. I was able to get fairly decent test coverage on the classes by making them more modular and testable but I am still very reluctant to say that the tests were "good". This perception does not, however, factor in ease of development for new developers, ease of use for users or even if the project can be successfully distributed because, again, despite having waited to the last minute, I ran out of time to fully test these requirements.

Tuesday, November 8, 2011

Introduction to WhatDepot?

In my last post, I intended to bring to everyone's attention the energy situation in Hawaii and how, in the midst of a unique problem, there are unique solutions. I also touched upon the fact that there exist numerous opportunities for a software developer to contribute as part of these solutions, especially in terms of data analyses.

As a concept, preparing a system for energy data analysis seems like a trivial task. Install a power meter, read that meter at some interval, store that data into a database and then query the database. However, when the time comes for implementation, it becomes a very involved task indeed. Luckily, we live in a time where open source applications are numerous and it's often easy to find an open source project that suits your needs, even if only partially so. Shortly after my previous entry, I learned of an open source web service which collects electricity data and stores it in a database. The service, aptly named WattDepot, even features a robust API for accessing the collected data. WattDepot can even be installed locally for experimentation or simulations and is capable of near real-time feedback. Of course, the best part of WattDepot might be that it's open source, and thus free to use for anyone who is interested in taking advantage of its many features.

But, as with almost all open source projects, there is certainly a bit of a learning curve involved. When I first tried my hand at creating a Robocode robot, there was quite a bit of trial and error as well as a heavy dose of perusing the Robocode API. WattDepot was no different. Although I had access to a WattDepot server which was already set up, and I knew that I had a wealth of predefined tools to help me interact with it, learning exactly how to do what I needed to do still took a bit of the proverbial banging of my head against the wall. Once again, in order to ease into a new system, I performed several katas to help familiarize myself.

1. Get sources. Print alphabetically.
One of the most basic things I learned quickly about WattDepot was that all the data came from sources. Each source had a variety of information associated with it, and one of the most useful bits of information is its name. In trying to alphabetize these sources, I tried to take advantage of the fact that the source names are Strings, and the String class's natural ordering happened to be, in a sense, alphabetical. Here, I "cheated" a bit and threw all the sources into a TreeSet to shed the duties of sorting. Then the realization came that even if I were to sort the names, I wouldn't necessarily be able to keep each source's description associated with it. Luckily, Source objects in WattDepot have a compareTo method built in, and as fate would have it, the comparison was by name.
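A simplified sketch of the trick, using a stand-in Source class rather than the real WattDepot client API: because the source type is Comparable by name, a TreeSet keeps everything alphabetized with no extra sorting code.

import java.util.Set;
import java.util.TreeSet;

public class SortSources {
  // stand-in for WattDepot's Source, comparable by name
  static class Source implements Comparable<Source> {
    final String name;
    final String description;
    Source(String name, String description) {
      this.name = name;
      this.description = description;
    }
    public int compareTo(Source other) {
      return this.name.compareTo(other.name);   // "natural" order is by name
    }
  }

  public static void main(String[] args) {
    Set<Source> sources = new TreeSet<Source>();
    sources.add(new Source("Lokelani", "tower"));
    sources.add(new Source("Ilima", "tower"));
    sources.add(new Source("Mokihana", "tower"));
    for (Source s : sources) {
      System.out.println(s.name + ": " + s.description);  // prints alphabetically
    }
  }
}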

2. Get sources. Print from newest to oldest data.
The second thing I quickly learned about WattDepot was that there were SensorData objects associated with the sources. In order to get timestamps for how recently the latest data was obtained from a particular source, I had to obtain the latest SensorData associated with each source. Here, I learned again how the compareTo method for this data type was implemented by throwing all the SensorData objects into a TreeSet and printing them out. As luck would have it yet again, the objects were sorted by how "fresh" the data was, and printing out the name of the source the data came from was a simple exercise in parsing the String representation of the source.

3. Get sources. Print out their subsources. And their subsources. And their subsources...
With a heavy dose of String manipulation review in my mind now thanks to all the data parsing from the previous exercises, it wasn't particularly difficult to split a source's subsources up and print them out. However, there was a bit of trouble here because in order to print out hierarchies, it made sense to make use of recursion. Unfortunately, upon investigation, all of the sources on this particular server had either only one level of subsources or none at all. What this means is that, even though I wrote my hierarchy-printing code to be prepared for recursion, I was never able to fully test it out, since there is no recursion involved if no subsources have their own subsources.
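The hierarchy-printing idea boils down to something like this (again a stand-in Source, not the actual WattDepot types):

import java.util.ArrayList;
import java.util.List;

public class PrintHierarchy {
  // stand-in source with a list of subsources
  static class Source {
    final String name;
    final List<Source> subsources = new ArrayList<Source>();
    Source(String name) { this.name = name; }
  }

  static void printTree(Source source, String indent) {
    System.out.println(indent + source.name);
    for (Source sub : source.subsources) {
      printTree(sub, indent + "  ");   // recurse one level deeper
    }
  }

  public static void main(String[] args) {
    Source campus = new Source("Hale-Aloha");
    Source ilima = new Source("Ilima");
    ilima.subsources.add(new Source("Ilima-A"));
    campus.subsources.add(ilima);
    printTree(campus, "");
  }
}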

4. Get sources. Print energy consumed by that source... yesterday.
This particular kata or simple exercise took, by far, the longest of all. The goal was very similar to the first two exercises, yet the implementation had to be drastically different. When a query to the WattDepot database for the energy consumed by a particular source is made, the result is just the energy as a floating-point number. In order to sort these numbers, I chose to put them all into an ArrayList. However, to keep the numbers relevant, I needed a second ArrayList with the names to which the energy consumption data corresponded. Only then was I able to sort the lists and print out the results. This kata, however, took a longer time to implement because I hadn't had a lot of experience manipulating Calendar objects, but the Calendar class turned out to be a surprisingly powerful and useful tool.
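The Calendar gymnastics for "yesterday" look roughly like this (the WattDepot query itself is left out of the sketch):

import java.util.Calendar;

public class Yesterday {
  public static void main(String[] args) {
    Calendar start = Calendar.getInstance();
    start.add(Calendar.DAY_OF_MONTH, -1);      // go back one day
    start.set(Calendar.HOUR_OF_DAY, 0);        // midnight at the start of yesterday
    start.set(Calendar.MINUTE, 0);
    start.set(Calendar.SECOND, 0);
    start.set(Calendar.MILLISECOND, 0);

    Calendar end = (Calendar) start.clone();
    end.add(Calendar.DAY_OF_MONTH, 1);         // midnight at the end of yesterday

    System.out.println("From " + start.getTime() + " to " + end.getTime());
    // the energy consumed between start and end would then be queried per source
  }
}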

5. Get sources. Print highest recorded power per source... yesterday.
With my newfound understanding of Calendars, and having already set the foundations for filling two lists which correspond to one another in order to sort them, this kata took a bit less time to implement. However, due to the fact that we needed to find the highest recorded power by each source, we had to sequentially query the database a number of times per source in order to compare and find the highest recorded number for each source. This not only added more loops, decreasing the performance of the application, but added a lot more individual queries to the server, and overall, the execution of this particular kata was almost excruciatingly slow. In order to expedite the process while developing, I cut the sampling down to a mere two queries per source, and when I was satisfied that the code was serviceable, I reverted back to shorter intervals between queries. Unfortunately, much like the conundrum between exhaustive and practical testing, I was only able to run practical tests with longer intervals so as to avoid waiting for long periods of time just to test the execution.

6. Get sources (I see a pattern here). Print the average energy used in the last two Mondays.
Given the last two katas, the most trouble this one gave me was actually in the date manipulation. I found that, if you tried to set the Calendar object's day of the week to Monday, you will end up getting the upcoming Monday's date. What I needed was the past Monday's date. After wracking my brain briefly, I ended up going with the brute force method of the good ol' switch statement. If today is Monday, go back seven days. Otherwise, if today is Tuesday, go back one day and so forth until Sunday, where we go back six days. No, this was not the most elegant solution, but given how little I still understand about how Calendar objects fully function, it was a workable solution.
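Sketched out, the brute-force switch looks something like this:

import java.util.Calendar;

public class LastMonday {
  public static void main(String[] args) {
    Calendar cal = Calendar.getInstance();
    int daysBack;
    // map today's day of the week to how many days to go back to reach
    // the most recent (already completed) Monday
    switch (cal.get(Calendar.DAY_OF_WEEK)) {
      case Calendar.MONDAY:    daysBack = 7; break;  // a full week back
      case Calendar.TUESDAY:   daysBack = 1; break;
      case Calendar.WEDNESDAY: daysBack = 2; break;
      case Calendar.THURSDAY:  daysBack = 3; break;
      case Calendar.FRIDAY:    daysBack = 4; break;
      case Calendar.SATURDAY:  daysBack = 5; break;
      default:                 daysBack = 6; break;  // Sunday
    }
    cal.add(Calendar.DAY_OF_MONTH, -daysBack);
    System.out.println("Most recent Monday: " + cal.getTime());
  }
}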

Through these exercises, I got to understand WattDepot a lot more and, as much as I hate to admit it, I learned a lot about Java that I may not have otherwise learned. While it is on a crude level so far, a lot of the energy data manipulation I learned from these exercises involved sorting data and formatting the output. This is deceptively important, however, because a big part of studying data involves presenting it in ways that allow trends and patterns to be identified. I was able to complete these six katas to my satisfaction, but I unfortunately neglected to jot down the specific amount of time each one took to implement because I tend to do my coding with breaks interjected. Ultimately, however, it isn't the amount of time devoted to learning a craft that matters, but what you are able to derive from the experience instead. In this case, I learned a lot more about WattDepot and how to manipulate the information I can obtain from interacting with it by sitting down and devoting time to learning, and I truly hope that I can inspire you to do the same. Even if it's not WattDepot, or even computer science... sit down and take the time to learn something today! After all, if I hadn't put in the time and effort, I'd definitely still be asking "what depot?"