2013-01-29

Producing LaTeX from NumPy Arrays

For my comprehensive exam, I needed to quickly convert some NumPy arrays into nice-looking LaTeX array elements. The TeX Stack Exchange site has a good answer for tabular environments, but it wasn’t quite suited to the array environment. The usual answer here would be Pweave but, being short on time, I ended up rolling my own function instead:

import sys

def to_latex(a, label='A'):
    # emit the 2-D array `a` as a \[ ... \] display,
    # with one centered column per array column
    sys.stdout.write('\\[ '
                     + label
                     + ' = \\left| \\begin{array}{'
                     + ('c' * a.shape[1])
                     + '}\n')
    for r in a:
        # separate entries with & and terminate each row with \\
        sys.stdout.write(' & '.join(str(c) for c in r))
        sys.stdout.write(' \\\\\n')
    sys.stdout.write('\\end{array} \\right| \\]\n')

Here’s an incomplete snippet of it in action, where I convolve an array t with four different filters, producing a LaTeX formula for each result:

filters = (('A \\oplus H_1',h1)
           , ('A \\oplus H_2',h2)
           , ('A \\oplus H_3',h3)
           , ('A \\oplus H_4',h4))

for label,f in filters:
    t2 = scipy.signal.convolve(t,f,'same')
    to_latex(t2.astype('uint8'),label=label)
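To make the output concrete, here is a hypothetical string-returning variant (to_latex_str is my own name, not part of the original gist) run on a small array:

```python
import numpy as np

def to_latex_str(a, label='A'):
    # same environment as to_latex, but returned as a string
    rows = [' & '.join(str(c) for c in row) + ' \\\\' for row in a]
    return ('\\[ ' + label + ' = \\left| \\begin{array}{' + 'c' * a.shape[1] + '}\n'
            + '\n'.join(rows)
            + '\n\\end{array} \\right| \\]')

print(to_latex_str(np.array([[1, 2], [3, 4]])))
# prints:
# \[ A = \left| \begin{array}{cc}
# 1 & 2 \\
# 3 & 4 \\
# \end{array} \right| \]
```

Returning a string rather than writing to stdout also makes the function easier to test and to embed in larger documents.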

I’ll likely get around to expanding this into a full package sometime in the future, since there’s a lot that is hard-coded (the \[ \] environment, the stringification of the array, the fact that all columns are centered, etc.). A gist of the function is available here.

2012-11-24

pythonbrew+opencv+debian

There are a number of ways to go about building a modern development environment for scientific computing and computer vision in python. If you’re used to developing on the bleeding edge, however, the latest debian stable makes it a chore to get started with the latest and greatest. It ships with python 2.6 instead of 2.7, and opencv is notoriously out of date in a number of distributions, debian included. I typically use Arch, but the server-class machines I have access to were running debian, so I had to bootstrap my setup into this environment.

Challenge accepted.

Thankfully, pythonbrew (or pythonz) comes to the rescue by making it easy to handle multiple pythons for a single account (without having to install them system-wide) as well as providing wrappers around virtualenv. However, not everything is rosy. The python you choose has to be built with shared libraries if you want to install opencv later:

pythonbrew install --configure="--enable-shared" 2.7.3 

After this, you can bootstrap a virtualenv as usual:

pythonbrew venv init
pythonbrew venv create debian
pythonbrew venv use debian

and install any requisite stuff you might need (at minimum, numpy and scipy):

pip install numpy
pip install scipy
pip install pymorph
pip install matplotlib
pip install distutils

Unfortunately, there’s no pip package for opencv. Thankfully, the debian installation guide isn’t too far out of date, and many of the packages it suggests installing via apt-get are still relevant.

wget http://downloads.sourceforge.net/project/opencvlibrary/opencv-unix/2.4.3/OpenCV-2.4.3.tar.bz2
tar xjvf OpenCV-2.4.3.tar.bz2
cd OpenCV-2.4.3
mkdir {build,release}
cd release

At this point, we need to delve into where pythonbrew puts all its related files to configure opencv correctly. First, your installed python will be available in one of two places (here python 2.7.3 is used as an example):

~/.pythonbrew/venvs/Python-2.7.3/{venv-name}/bin/python
~/.pythonbrew/pythons/Python-2.7.3/bin/python

All virtualenvs based on a particular version of python will have a copy of that python binary for use in their own isolated environment. In addition, the virtualenv has an include directory that you should use, since all your additional packages installed into the virtualenv will place their headers in this directory:

~/.pythonbrew/venvs/Python-2.7.3/{venv-name}/include/python2.7

The hitch, however, is that the virtualenv does not have a copy of (or symlink to) the shared library we specifically built when first compiling python with pythonbrew, unlike a typical native python install. This means that cmake’s approach to locating this library will fail. Thus we must point opencv to this

~/.pythonbrew/pythons/Python-2.7.3/lib/libpython2.7.so

for it to build correctly.

Speaking of cmake, there is a bug in the cmake included in debian that prevents it from building opencv correctly. I was lazy and simply grabbed a binary of the latest cmake,

wget http://www.cmake.org/files/v2.8/cmake-2.8.9-Linux-i386.tar.gz

which worked on my debian build, but it’s better to compile it if you plan to continue using it for more than a one-off build.

Finally, understanding opencv’s cmake flags is important for getting everything stitched together:

PYTHON_EXECUTABLE=~/.pythonbrew/venvs/Python-2.7.3/{venv-name}/bin/python
PYTHON_INCLUDE_DIR=~/.pythonbrew/venvs/Python-2.7.3/{venv-name}/include/python2.7
PYTHON_LIBRARY=~/.pythonbrew/pythons/Python-2.7.3/lib/libpython2.7.so
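Rather than assembling these paths by hand, the interpreter can report them itself; here is a quick sketch using the stdlib sysconfig module (run it under the pythonbrew python you intend to build against, so the reported paths match that interpreter):

```python
import os
import sys
import sysconfig

# candidate values for the three cmake flags above,
# as reported by the running interpreter
print('PYTHON_EXECUTABLE=%s' % sys.executable)
print('PYTHON_INCLUDE_DIR=%s' % sysconfig.get_paths()['include'])

# the shared library lives in LIBDIR under the name LDLIBRARY
libdir = sysconfig.get_config_var('LIBDIR')
ldlib = sysconfig.get_config_var('LDLIBRARY')
if libdir and ldlib:
    print('PYTHON_LIBRARY=%s' % os.path.join(libdir, ldlib))
```

Note that for a virtualenv this reports the venv’s own paths, so the PYTHON_LIBRARY line may still need to point back at the pythonbrew install as described above.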

Additionally, if you find that numpy isn’t autodetected, you can specify

PYTHON_NUMPY_INCLUDE_DIR=~/.pythonbrew/venvs/Python-2.7.3/{venv-name}/lib/python2.7/site-packages/numpy/core/include

You can also specify your virtualenv path as the installation destination for the python bindings:

PYTHON_PACKAGES_PATH=~/.pythonbrew/venvs/Python-2.7.3/{venv-name}/lib/python2.7/site-packages

or just symlink/copy the resulting cv2.so and cv.py files there later.

Putting it all together, I used this command to generate the makefile which compiles correctly against pythonbrew’s python (where debian is my virtualenv name):

~/cmake-2.8.9-Linux-i386/bin/cmake \
-D CMAKE_INSTALL_PREFIX=../build \
-D BUILD_NEW_PYTHON_SUPPORT=ON \
-D BUILD_PYTHON_SUPPORT=ON \
-D BUILD_EXAMPLES=OFF \
-D PYTHON_EXECUTABLE=~/.pythonbrew/venvs/Python-2.7.3/debian/bin/python \
-D PYTHON_INCLUDE_DIR=~/.pythonbrew/venvs/Python-2.7.3/debian/include/python2.7 \
-D PYTHON_LIBRARY=~/.pythonbrew/pythons/Python-2.7.3/lib/libpython2.7.so \
-D PYTHON_NUMPY_INCLUDE_DIR=~/.pythonbrew/venvs/Python-2.7.3/debian/lib/python2.7/site-packages/numpy/core/include \
-D PYTHON_PACKAGES_PATH=~/.pythonbrew/venvs/Python-2.7.3/debian/lib/python2.7/site-packages \
../
make
make install

Depending on what you’re doing, there may be other tricks with LD_LIBRARY_PATH to make specific things work, but your pythonbrewed python should be primed to access opencv from here.

2012-09-15

Anatomy of a Chrome Extension

A few weeks back I launched nonpartisan.me, which exists primarily in the form of a Google Chrome extension (there’s a Firefox add-on too). Since I released it with all of the source, now is a great time to dissect the (very simple) code. As you will notice from the site and the small bit of press it picked up, nonpartisan.me has a very simple premise: filter out political keywords from the various newsfeeds (specifically Facebook, Twitter, and Google+).

This was my first attempt at a Chrome extension, and it’s surprisingly straightforward. All such extensions require a manifest.json, which looks like this for nonpartisan.me:

{
    "name"             : "nonpartisan.me",
    "version"          : "0.2.1",
    "manifest_version" : 2,
    "description"      : "Removes partisanship from your news feeds",
    "icons"            : { "16": "icon16.png",
                           "48": "icon48.png",
                          "128": "icon128.png" },
    "homepage_url"     : "http://nonpartisan.me",
    "page_action"      : {"default_icon" : "icon48.png",
                          "default_title": "nonpartisan'ed" },
    "permissions"      : ["tabs",
                          "http://www.facebook.com/",
                          "http://www.twitter.com/",
                          "http://plus.google.com/"],
    "options_page"     : "options.html",
    "content_scripts"  : [
    {
        "matches": ["*://*.facebook.com/*"],
        "js"     : ["jquery.js","common.js","fb.js","nonpartisan.js"],
        "run_at" : "document_end"
    },
    {
        "matches": ["*://twitter.com/*"],
        "js"     : ["jquery.js","common.js","tw.js","nonpartisan.js"],
        "run_at" : "document_end"
    },
    {
        "matches": ["*://plus.google.com/*"],
        "js"     : ["jquery.js","common.js","gp.js","nonpartisan.js"],
        "run_at" : "document_end"
    }],
    "background": {"scripts"   : ["common.js","background.js"],
                   "persistent": false }
}

The real meat here is content_scripts, which lists the javascript we wish to trigger after a page is loaded, greasemonkey-style. A particularly nice feature of content scripts is that they work in an isolated environment separate from any javascript that the page itself may include. Thus we can add jquery to the list of javascript that is run without fear of clashing with a page’s global namespace.

You can think of every element in the "js" array as a separate <script> tag in an HTML page, so the files are loaded in the given order, all into a single namespace. Rather clumsily, I chose to simply put a callback module (which is called plugin here) in the individual fb.js, tw.js, and gp.js files which is then used by the core component, nonpartisan.js, as a simple means of avoiding any hard-coded per-site values in the actual filtering code.

With this, and the pseudo-regex "matches" field that specifies which pages trigger the content script, we can run arbitrary code on websites we specify. For nonpartisan.me, the filtering code looks like this:

"use strict";
var nonpartisan = function(plugin) {

    function nonpartisan (watch,parent,keywords) {
        function kill (parent,removeList){
            $(parent).each(function () {
                var el = $(this);
                if(el.css('display') !== 'none') {
                    el.find('*').each(function () {
                        var toCheck = $(this).text().toLowerCase();
                        if(toCheck.length > 0 &&
                           (removeList.some(function (value) {
                               return (toCheck.search("\\b"+value.toLowerCase()+"\\b") >=0);
                           }))
                          ) {
                            el.css({'display':'none'});
                            return false;
                        }
                    });
                }
            });
        }

        if($(parent) && $(watch)) {
            var numChildren = $(parent).children().length;
            setInterval(function () {
                var newNumChildren = $(parent).children().length;
                if(numChildren !== newNumChildren) {
                    kill(parent,keywords);
                    numChildren = newNumChildren;
                }
            },
                        500);
            kill(parent,keywords);
        }
    }

    // get parameters from plugin and trigger nonpartisan() here...

}(plugin);

The first chunk, the kill function, works as advertised: given a parent element and a set of keywords, the function iterates over every child element and determines whether any of the nested elements within (i.e. el.find('*')) contains any of the keywords. Instead of deleting DOM nodes, which may break the page’s own javascript (I discovered this the hard way), it’s easier to call el.css({'display':'none'}); to simply hide unwanted elements. For efficiency, the each loop terminates (by returning false) as soon as any nested child matches, saving a small amount of needless searching.

The second chunk starts a timer (if indeed the parent is even found on the current page) that checks if the number of children of the parent element has changed and, if so, re-triggers the filtering process to determine if there are any new children to be hidden. This helps handle AJAX-driven sites, like the “infinite scrolling” facebook newsfeed, which may mutate the DOM at any time. Both of these functions are wrapped up into another easy-to-call function inside of the high-level nonpartisan module.

And that really is all there is to a typical greasemonkey-like Chrome extension, but that’s certainly not the end of what a complete and helpful extension can provide. The trickier bit is persisting configuration options. The downside of sandboxing content scripts is that they exist in a transient execution context, meaning there’s no localStorage to persist program options. The details of the plumbing used to kick-off the process and handle options were omitted from the above snippet, so we’ll dig more into this now to illustrate how to handle persistent options.

Chrome provides a nice solution to the problem of content scripts lacking localStorage: a background script, which does have its own localStorage, can transmit that state to a content script via the chrome.extension.onMessage listener. We can then fill in the omitted component of the above snippet with:

chrome.extension.sendMessage({method: "config"}, function (response) {
    if(!response.sites[plugin.site]) return;
    var l = response.filter;
    if(l && l.length>0) {
        plugin.cb(l,nonpartisan);
    }
    // get default values from common.js
    else {
        l = [];
        for(var index in choices) {
            l = l.concat(choices[index]);
        }
        plugin.cb(l,nonpartisan);
    }
});

This sends a message, requesting "config" from the background.js script, which returns, among other things, the list of keywords we wish to filter. This list was saved in localStorage in background.js’s execution context. Recall that plugin is the module that specifies the particular settings for the page being filtered. Thus we pass along the list of words to filter and the nonpartisan() callback function to the plugin module, and it subsequently executes nonpartisan() on the appropriate elements on the DOM. The background.js file used in nonpartisan.me is a bit more involved, but it nonetheless essentially acts as a broker, converting Chrome’s internal message-passing API calls to localStorage requests.

Of course, there’s only so much utility to be gained from localStorage without giving the user the ability to configure the various options that may be saved therein. This is done by a typical html page, specified by "options_page". Since there’s not much magic there (it’s just a plain html page with enough javascript to persist the settings), I will omit the gory details; you can poke around the repository yourself, if you’re so inclined.

So that’s an extension. Writing the above was literally a matter of minutes and some quality time with the Chrome API specifications. As is always the case (especially when I’m working outside of my area of expertise, say with making the amateurish logo), the real work is doing the little bits of spit-and-polish to handle the various configuration options, throwing together the webpage, creating the icons and promotional images for the Chrome Web Store, etc. But it’s still good to know that the Chrome team has made the extension-building process as simple and well documented as they have.

2012-08-11

Poor Man's LDAP

In addition to being a researcher and backend web developer, I’ve also worn the system administrator hat for a number of years. While the likes of LDAP, Active Directory, NIS, and their ilk can work quite well for managing medium-to-large networks, I’ve more often been tasked with managing small-scale (< 20 machines) heterogeneous Linux networks, where deploying LDAP with full Kerberos authentication would be overkill. Typical requirements I’ve encountered in small lab settings are simple user account and home folder sharing, and (relatively) similar package installations.

With this in mind, I did what probably every sysadmin in the same situation would do: scrape together a simple set of scripts to handle basic file synchronization for me. Specifically, I noticed two prevalent requirements among config files being synced:

  • machines and/or distros have a common header or footer that must be included (e.g., a list of system users in /etc/passwd), and

  • specific machines (e.g., servers) shouldn’t have some files synced with the rest of the machines (e.g., file shares might be different on a server).

Thus, Poor Man's LDAP was born.
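The first requirement, merging a shared header with per-machine entries, can be illustrated with a short Python sketch (purely hypothetical; pmldap itself is a collection of shell scripts, and merge_config is my own name):

```python
def merge_config(common_lines, machine_lines):
    # shared entries (e.g. system users in /etc/passwd) come first;
    # machine-specific entries follow, minus duplicates of the shared ones
    seen = set(common_lines)
    return common_lines + [l for l in machine_lines if l not in seen]

passwd = merge_config(
    ['root:x:0:0:root:/root:/bin/bash'],
    ['alice:x:1000:1000::/home/alice:/bin/bash',
     'root:x:0:0:root:/root:/bin/bash'])  # duplicate root entry dropped
```

The second requirement (excluding specific machines from certain files) then amounts to simply skipping the sync step for those machine/file pairs.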

While nothing more than a collection of scripts (no different, in all likelihood, than what many other sysadmins have implemented), they will hopefully be of use for those who, like me, are graduate students or otherwise non-full-time sysadmins who don’t have time to do things the “right” way.

I’m dogfooding pmldap on my research lab’s network, where we (currently) have 5 Fedora machines (various versions between 10 and 16) and 5 Debian machines (all on stable). Since my recent patch, pmldap now supports groups, which are useful for running yum commands only on the Fedora machines and apt commands on only the Debian boxes. Files being synchronized include: fstab, group, hosts, hosts.allow, hosts.deny, passwd, shadow, and sudoers.

Also in the repo are a few convenience tools that I’ve found useful:

  • authorize-machine bootstraps a machine by setting up ssh keys

  • setup bootstraps config files from a remote machine so they can be merged with the desired additions

  • cmd runs an arbitrary command on all machines (or a particular group of machines)

  • useradd is a feature-incomplete reimplementation of the native useradd command that works on local passwd, shadow, and group files to add new users that can later be synchronized across the network

Since I hadn’t stumbled across anything of this scope that fits the small-scale-network use case, I’m hopeful that pmldap will be of use to anyone in a similar situation.

You’ll find it on github here.

2012-07-29

Git Dotfile Versioning Across Systems

For users of unix-like operating systems, treating your dotfiles like real code and keeping them in a repository is a supremely good idea. While there are a myriad of ways to go about this, the typical (albeit destructive) way to do this is by symlinking files in the repository to the home folder:

#!/bin/bash
DEST=$HOME
FILES=$(git ls-files | grep -v .gitignore | grep -v ^$(basename $0)$)
for f in $FILES ; do
    [ -n "$(dirname $f)" \
      -a "$(dirname $f)" != "." \
      -a ! -d "$DEST/$(dirname $f)" ] \
    && mkdir -p $DEST/$(dirname $f)
    ln -sf $(pwd)/$f $DEST/$f
done

I specifically chose to have FILES populated using git ls-files to prevent any unversioned files from sneaking into the home folder, additionally filtering out both the .gitignore file and the current script name (so it can be safely checked in as well). After this, we loop over the files, creating appropriate directories if they do not exist, effectively symlinking the entire repo to the home folder and clobbering any files that are already there (without asking!).

While most dotfiles won’t care what system they are on, certain scripts or settings may be machine-dependent. To accommodate this, I include a ~/.sys/`hostname`/ folder for every machine with system-specific files. Then, when symlinking, we favor files listed in the ~/.sys/`hostname`/ folder rather than the top-level files:

if [ -e ".sys/$(hostname)/$f" ] ; then
    ln -sf $(pwd)/.sys/$(hostname)/$f $DEST/$f
else
    ln -sf $(pwd)/$f $DEST/$f
fi
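The same prefer-the-host-specific-copy rule, rendered in Python for illustration (source_path is a hypothetical helper, not part of the actual script):

```python
import os
import socket

def source_path(repo, f):
    # prefer the machine-specific copy under .sys/<hostname>/, if one exists
    host_copy = os.path.join(repo, '.sys', socket.gethostname(), f)
    if os.path.exists(host_copy):
        return host_copy
    return os.path.join(repo, f)
```

Given a repo with both a top-level .gitconfig and a .sys/machine2/.gitconfig, this resolves to the machine-specific copy only on machine2, and to the top-level file everywhere else.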

Thus, for example, given machine1 and machine2 and a repo in the ~/dotfiles directory with these files:

~/dotfiles/.gitconfig
~/dotfiles/.sys/machine2/.gitconfig

machine1 will get a symlink from

~/dotfiles/.gitconfig 

to ~/.gitconfig, while machine2 will instead get a symlink from

~/dotfiles/.sys/machine2/.gitconfig

to ~/.gitconfig. This variant of the script doesn’t explicitly ignore the .sys folder itself, so it will be added to the home folder as well. As an aside, this can be useful: include something like this

[ -d ~/.sys/`hostname`/bin ] && export PATH=~/.sys/`hostname`/bin:$PATH

in the .bashrc file such that specific scripts will be on the PATH for individual machines.

So the final script, with a bit of input checking, looks like this:

#!/bin/bash
set -e
EXPECTED_ARGS=1
if [ $# -lt $EXPECTED_ARGS ]
then
    echo "Usage: `basename $0` directory"
    echo "WILL clobber existing files without permission: Use at your own risk"
    exit 65 
fi

DEST=$1
FILES=$(git ls-files | grep -v .gitignore | grep -v ^$(basename $0)$)

for f in $FILES ; do
    echo $f
    if [ -n "$(dirname $f)" -a "$(dirname $f)" != "." -a ! -d "$DEST/$(dirname $f)" ] ; then
        mkdir -p $DEST/$(dirname $f)
    fi
	
    if [ -e ".sys/$(hostname)/$f" ] ; then
        ln -sf $(pwd)/.sys/$(hostname)/$f $DEST/$f
    else
        ln -sf $(pwd)/$f $DEST/$f
    fi
done

By making DEST a command-line parameter, a dry-run can be done by simply giving it an empty folder. There’s no issue doing this inside the repo’s working tree, as only checked-in files will be transferred to the target directory:

> mkdir tmp
> ./deploy tmp/

Doing this, the contents of the tmp/ directory can be verified with ls -al to see exactly what the script will do to your home folder. Once satisfied, it can be run again with

> ./deploy ~

to symlink all the files to the home folder proper.

Feel free to grab an up-to-date version of this script from my own dotfile repo here.