Challenging Software Management

Seldom does an article pluck a thought lurking in the corners of our consciousness, place a spotlight on it, and then reveal it holds the key to unraveling deeply seated beliefs. Matthew Stewart’s The Management Myth is such an article, a rare piece that takes a practical look at management theory – its history, how it is taught, and the author’s personal experience. Reading the article, I found myself thinking, “Finally, a management consultant giving an insightful critique of management rather than hype!”

In his article, Stewart explains how Frederick Winslow Taylor came up with the first industrial-era ideas of management theory in 1899. Taylor was working for the Bethlehem Steel Company when he invented a scheme to get workers to load pig iron bars onto rail cars more quickly. He later went on to apply his approach to other business problems and write a book titled The Principles of Scientific Management.

Even at the time, it was clear that Taylor’s conclusions were more pseudoscience than science. Taylor never published raw data for his pig iron studies. And when Taylor was questioned by Congress about his experiments, he casually admitted to making reckless adjustments to the data, ranging from 20 percent to 225 percent.

Despite serious protocol flaws and Taylor’s failure to adhere to even the spirit of the scientific method, management has doubled down by embracing empirical measurements that are meaningless indicators. The purpose of these number exercises is to convince business leaders that the right things are happening. The belief that if you don’t measure it, you cannot manage it continues unabated within unproven methodologies such as Earned Value Management (EVM) or the prioritization of software backlogs using ROI, NPV, or other numerical or financial metrics. The fact that these numbers are fictitious hasn’t slowed anyone down from using them.

But here is the question Stewart brilliantly poses in his article, and it sharpens the argument against these practices: how is it possible that empirical management continues to be used when the theories and approaches themselves are not held accountable to the very same metric disciplines they force on everyone else? Every development team must prove, using numbers, that it is on track – but nobody has proven that such accountability is effective.

I can confirm, anecdotally through my personal experience, that project management “number exercises” do not lead to improved performance, better risk management, higher quality, or customer satisfaction. In my experience, for what it’s worth, the more “sophisticated” a management approach, the more likely it will have the exact opposite effects.

In my next blog, I will talk about what we need from software management, which is to set constraints to resist the natural temptation to build “something” rather than the right thing.

Incorporating Strong Center Design Into an Agile Project

Strong Center Design is an approach we developed at Sphere of Influence that unifies software design so that every feature and design choice enhances the product’s impact by reinforcing a single, powerful impression. It is an alternative to random design choices that lead to a mish-mash of competing centers.

One of the challenges we overcame was integrating Strong Center Design with an Agile culture where it is a matter of ritual to prioritize features and design choices on an iterative and incremental basis. To integrate Strong Center Design with Agile, we considered five distinct approaches and examined the weaknesses of each one.

1. Design up-front
Insert a dedicated design step (much bigger than a sprint-zero) before launching the first Agile sprint. A good analogy is the ‘pre-production’ step used when filming Hollywood movies.
Weakness: Has all the same pitfalls as waterfall (phase-gate) development.

2. Design in each sprint
Do a little design at the beginning of every sprint.
Weakness: There is never enough time to develop a good design without delaying developers. Either design or productivity suffers.

3. JIT design a few sprints ahead of development
Two separate parallel workflows in each sprint: one for design (about two sprints ahead) and the other for development. Use Kanban signals to trigger JIT design work before it is needed by the next sprint.
Weakness: Not ideal because the further ahead design gets, the less Agile the process becomes. A design team can also end up supporting two different sprints at the same time – the next design sprint and the current development sprint.

4. Dedicated design sprints
Sprints oscillate between design-focus and development-focus.
Weakness: Everyone is tasked to work design during design sprints even if they lack the skill or desire to work on the design. The reverse is true for development sprints.

5. Designer and developer partnerships
Known in academia as ‘fused innovation’, this pairs a design professional with one or more developers.
Weakness: Federating designers contradicts the objective of achieving wholeness in the design. It is also difficult to implement with a distributed team.

To make Strong Center Design compatible with Agile, we discovered a hybrid approach worked best, formed from two of the options.

First, do a little design up-front. Don’t design the entire product, but take some time to establish a strong conceptual center – something we call the North Star. The North Star creates a unified focus that everyone can agree on. It also fleshes out the design language and any design principles that will shape the product. Do this work up-front, before the first sprint.

Once development starts, we found the best way to achieve Strong Center Design is to replace the typical Product Owner role in Agile with a designer who leads the team. This is the most controversial aspect of our approach, as many people regard consensus-based (i.e., collaborative) prioritization as a core tenet of Agile. However, relying on consensus to prioritize design choices tends to optimize a single part of the product at the expense of the whole.

While not perfect, this blend gives us the best of both worlds: a design-driven product with a single, unified identity, plus the production efficiency of Agile.

Vet Your Agile Advisor with 5 Questions

Whether you are making an internal hire or bringing in a hired gun, onboarding someone to advise you about Agile is a risky but important gamble. Not all Agile advisors are equal; there are good ones, bad ones, and some in-between. How can you tell the difference?

Just to be clear – Agile is the worldwide standard for software development today. However, Agile is a philosophy, not an instruction book. Like any philosophy, Agile is susceptible to variation in interpretation and implementation. Such variation has extreme consequences, sometimes creating a fast-paced highly productive culture and other times a lumbering beast that drains budgets with low workforce engagement and questionable delivery throughput. It should come as no surprise that simply ‘being Agile’ is no cure for anything.

I often work with post-Agile organizations, i.e., organizations that have fully transitioned to Agile software development. The top complaint I hear from executive management in these organizations is that productivity has been the biggest disappointment.

I agree that many organizations suffer chronic productivity problems even after transitioning to Agile. The root cause isn’t Agile itself; it’s the type of Agile.

As a philosophy, Agile is non-hierarchical, self-organizing egalitarianism: consensus decision-making, community ownership, deep collaboration, transparency, and open communication. From this perspective, Agile could be a 1970s-era hippie commune, complete with a spiritual leader, a.k.a. the ‘Agile Coach’. Viewed this way, there is no emphasis on driving workforce engagement, productivity, high-margin returns, or blistering speed.

Relax man! We’ll get there when we get there


However, the hippie bus is not the full story with Agile. The Agile philosophy also embraces extreme discipline around test generation and execution, continuous integration, individual craftsmanship, proper software engineering, lean workflow optimization, small elite teams, and continuous delivery. Agile can be aggressively aerodynamic, packing tons of horsepower into a small footprint – if that’s your thing.

Unfortunately, few Agile practitioners go deeper than the commune-style egalitarianism aspects of Agile.


Imagine yourself in the position of vetting a new Agile advisor. What questions do you ask?

Here are 5 questions we designed to help you vet whether someone is performance-oriented:

1 – How does Agile fail?

If they answer with some variation of “lack of buy-in from management or stakeholders,” then run, because they plan to blame you for any failures. As with all philosophies, Agile can fail; and it does…predictably. Someone with real experience squeezing high performance out of Agile should be familiar with its modes of failure. Those who sprinkle ‘Agile fairy dust’ over organizations generally refuse to accept that Agile can and does fail, and they will be unable to articulate the circumstances under which it fails or why. Those people will likely answer your question from the ‘Agile is perfect – everything else is the problem’ perspective. That perspective is naïve and not at all useful.

2 – If productivity is my #1 concern, how does that impact Agile?

I’m not suggesting productivity should be your #1 concern, but you need to know how your Agile advisor reacts to such prioritization. What components of their approach to Agile would be emphasized or deemphasized to optimize for productivity? Lower-end advisors will avoid answering the question directly and instead quibble over the definition of ‘productivity’. Lesser Agilists see productivity as an outdated 20th-century ode to Taylorism. They’ll talk about how it’s not proper to think in terms of productivity in the modern age and how software organizations are much more complicated than that. Of course, this attitude is partly to blame for the chronic productivity problems so many IT shops suffer after transitioning to Agile. Not only do many Agile advisors not know how to optimize for productivity, they don’t even recognize productivity as something important.

3 – If innovation is my #1 concern, how does that impact Agile?

This question is intended to separate the master class from the middle. It may come as a surprise, but innovation is rarely discussed within Agile circles. It’s not an ugly word, like ‘productivity’; it’s just not discussed much. Lesser Agilists believe innovation is a natural ‘happy accident’ that emerges from self-organizing egalitarianism, consensus-based decision-making, community ownership, deep collaboration, and open communication. These people lack a basic understanding of innovation leadership and certainly don’t know how to incubate it alongside Agile. The truth is, innovation is not addressed by Agile. If innovation is a priority, additional workflows are necessary to generate, select, and develop fresh high-impact ideas.

4 – How will I know if Agile is successful?

A gripe many executives express 3 (or 5) years after embarking on an Agile transformation is that they are still transforming. When does the train arrive at the station? How can you tell when Agile is working? Just as lesser Agilists are unlikely to blame Agile for any failures, they are equally unlikely to promise measurable improvements or success. A grittier person will take this question seriously and put their reputation on the line. Think of it this way: if your organization doesn’t feel a sharp improvement in feature delivery and quality at lower production costs… why exactly is Agile being embraced?

5 – If we only adopted three things from Agile, what three practices will make the biggest difference?

This question separates the bottom of the barrel from the middle. If the answer covers daily standups, retrospectives, planning poker, pair programming, story walls/backlogs, or anything like that, then that’s a fail. If their list of three includes ‘iteration’, that’s borderline okay – but iteration is hardly unique to Agile (even Waterfall had it) and should be done regardless. If their list includes Continuous Integration or Continuous Delivery, then gold star; in fact, it’s hard to envision a correct answer that omits those two. Also give gold stars for advanced Test Automation, but the answer must go beyond the mere basics of creating unit tests that will inevitably wind up in the ‘technical debt’ pile; i.e., the answer must key on ‘advanced’. The double-gold star goes to the person who prioritizes ‘technical excellence’, particularly with respect to team member selection. It’s not that the 4 values and 12 principles of Agile aren’t all important, but some are way more important than others. Does your Agile advisor know which ones those are, or at least have a strong opinion?

Finally, for extra credit, you could ask them how they would implement Agile without Scrum. Many practitioners of Agile only know Scrum (it takes < 1 hour to become a Certified Scrum Master). Challenging your Agile advisor to formulate an approach to Agile without Scrum tests their knowledge of ‘first principles’ rather than their hour of training. It’s like asking an artist to paint something in black and white; sure, they have the skill to paint with color – but it’s a good test of whether they can make something beautiful from a smaller palette.

Happy Data Privacy Day – Why Don’t I Feel Safer?

Happy Data Privacy Day – Now stop the hysteria.

In honor of Data Privacy Day (Jan 28th), we must point out how the hysteria surrounding privacy has created an irrational fear that slows adoption of important technologies and, as a result, actually hurts people.

Privacy is a serious matter. We all know someone who has had their identity stolen. The financial loss, inconvenience, and personal violation cause identity theft to rank alongside health issues as one of the worst things that can happen to an individual. Our ever expanding digital footprint creates a target-rich environment for criminals that exposes deeply personal matters of finance and individual privacy.

Sadly, many so-called privacy advocates are exploiting this fear to ensure their own relevance. They use occasions like “Data Privacy Day” to convince consumers to avoid “big data” and opt out of many modern conveniences. They juxtapose modern, data-hungry digital services with identity theft, leaving the consumer afraid and confused. The advice is to “just say no” to all matters of digital consent, particularly if there is big data connected to a big company. They promote the idea that large corporations are looking to steal our assets and make huge profits from the details of our lives, leaving us exposed and compromised in the process.

Of course there are unscrupulous companies in today’s world that should not be entrusted with your personal information. Many firms have historically done a poor job of managing their consumer relationships. A company that puts profits before brand and consumer trust is hardly the model we should strive for. Instead, as consumers, we should insist that these companies implement the kind of rigor that secures our personal data, maintains our privacy, contributes insight, and provides consumers the means and the option to change their minds.
So let’s not throw the baby out with the bathwater.

The advice and fear-mongering promulgated by these Luddites provides no advantage in the digital age. Suggesting that we can and should opt out of modern digital services that aggregate data is like asking us to keep our life savings in our mattress. Should we store our flash drives under the mattress, or in the freezer, where they will be safe from outside use? Those who do will suffer significant disadvantages compared to those who participate. Like a depreciating capital asset, data left idle should be put to better use.

Analytics on Big Data opens doors and offers insights never before imagined. Computational analytics on large data sets changes the game in Agriculture, Health, Automotive, Energy, and almost every other sector. Large aggregated data sets allow science to discover the weakest of signals and amplify them in ways that produce predictive, informed insights. That is, unless consumers are frightened into thinking the risks outweigh the rewards.

Instead of alarming consumers about the dangers of participating, we should give them the facts about big data and the role privacy plays. Privacy advocacy groups would better serve their constituents by detailing the questions people should ask and the specific demands consumers should make of their personal-data suitors.

We at Sphere of Influence have developed our Data Privacy Rules of Engagement. We believe rules like these are a good approach for consumers who want to be sure that a request for their personal data will yield collective benefits without compromising their identity.

Sphere of Influence Data Privacy Rules of Engagement
1. First Do No Harm
2. Preserve the Public’s Right to Know
3. Preserve Consumer Right to be Forgotten
4. Preserve Consumer Right to be Remembered
5. Keep Relevancy Relevant

Balancing these competing rights while keeping the personal components of data private can be accomplished through a robust application of process and technology. This, in turn, keeps private data private while still allowing aggregated, anonymized data to benefit consumers and society at large. The techniques involved combine anonymization, a multi-mount-point architecture, split repositories for private and service-accessible data, and a comprehensive service layer that only provides access to the data to which it has rights.
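As a very rough sketch of what the split-repository idea can look like in code – with invented field names, a toy salted-hash pseudonym, and none of the key management, access control, or auditing a real system requires – consider:

```python
import hashlib
import secrets

# Invented example record; field names are illustrative only.
record = {"name": "Jane Doe", "email": "jane@example.com",
          "zip": "22102", "purchase_total": 84.50}

SALT = secrets.token_hex(16)   # in practice a managed secret, not generated per run

def pseudonym(identity: str) -> str:
    """One-way pseudonym so the analytics store never holds raw identity."""
    return hashlib.sha256((SALT + identity).encode()).hexdigest()

# Split repositories: identity data and service-accessible analytics data live apart.
identity_store = {pseudonym(record["email"]): {"name": record["name"], "email": record["email"]}}
analytics_store = [{"pid": pseudonym(record["email"]),
                    "zip3": record["zip"][:3],            # coarsened to reduce re-identification
                    "purchase_total": record["purchase_total"]}]

def service_layer(requested_fields, authorized_fields):
    """Return only the fields the caller has rights to; identity never leaves its store."""
    allowed = [f for f in requested_fields if f in authorized_fields]
    return [{f: row[f] for f in allowed} for row in analytics_store]

print(service_layer(["zip3", "purchase_total", "email"],
                    authorized_fields={"zip3", "purchase_total"}))
```

Deleting an entry from the identity store while leaving the anonymized analytics rows behind is one simple way a design like this can honor the right to be forgotten without destroying the collective value of the data.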

Conclusion

Big data is streaming off our vehicles, portable devices, and consumer electronics, portraying an important and valuable digital imprint of ourselves. There is no putting the genie back in the bottle. Rather than use hysteria to make consumers run for the hills, we should accept the reality of today’s digital world, embrace the opportunity for advancement in science, and insist on a comprehensive approach to data privacy from the companies that use our data.

Chris Burns, Director
Sphere of Influence Software Studios
-A Premium Analytics Company

Too Little, Too Late – Morgan Stanley could have prevented the Data Leak

 

In a recent article about the Morgan Stanley insider theft case, Gregory Fleming, the president of the firm’s wealth management arm, said:

“While the situation is disappointing, it is always difficult to prevent harm caused by those willing to steal”

Disappointing? 350,000 clients – the firm’s top 10% of investors – were compromised, and this following a breach that left 76 million households exposed.

Morgan Stanley fired one employee.

The fact is, this breach was preventable. Firms like Morgan Stanley are remiss in allowing these breaches to occur, and they add to the problem by perpetuating the myth that such breaches cannot be stopped. The minimal approach of repurposing perimeter cyber security solutions does not work. Perimeter solutions and practices were in place in every recent case of insider breach, including the U.S. government (Bradley Manning, Edward Snowden), Goldman Sachs, and the multiple Morgan Stanley breaches. Even Sony Entertainment had some intrusion protection in place. Cyber security professionals remain one step behind the criminals in defining events, thresholds, and signatures – and none of these are effective against the insider.

Building behavioral profiles for all employees, managers, and executives using objective criteria is the best, and possibly the only, feasible way to catch the insider. Current approaches that search for malicious insiders based on the appropriateness of the websites they visit, or on an employee’s stability as inferred from marital situation, seem logical but provide little value. Plenty of people get divorced and do not steal from their employers or their country.

Rules and thresholds defined by human resource and cybersecurity professionals have proven ineffective at stopping the insider.  Data analytics using unsupervised machine learning on a large, diverse dataset is essential.  Sphere of Influence developed this technology and created the Personam product and company.
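Personam’s models are its own; purely as a generic illustration of the unsupervised idea (no rules, no hand-set thresholds), here is a sketch using scikit-learn’s IsolationForest on invented per-account network features:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Invented daily behavioral features per employee account (not Personam's feature set).
n_users = 500
features = pd.DataFrame({
    "bytes_uploaded_offsite": rng.lognormal(10, 1, n_users),
    "distinct_hosts_contacted": rng.poisson(20, n_users),
    "after_hours_logins": rng.poisson(1, n_users),
})
# Plant one account that quietly moves far more data offsite than its peers.
features.loc[0, ["bytes_uploaded_offsite", "after_hours_logins"]] = [5e6, 12]

# Unsupervised: the model learns what "normal" looks like from the data itself.
model = IsolationForest(contamination=0.01, random_state=0).fit(features)
features["anomaly_score"] = model.decision_function(features)

print(features.sort_values("anomaly_score").head(3))   # most anomalous accounts first
```

The point of the sketch is the shape of the approach: scores emerge from the population’s own behavior rather than from thresholds someone in HR or security wrote down in advance.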

Personam catches insiders before damaging exfiltrations.  It is designed for the insider threat, both human and machine based, and has a proven record of identifying illegal, illicit, and inadvertent behaviors that could have led to significant breaches.

The malicious insider can be caught. It is time to take the threat seriously, and time to stop giving firms like Morgan Stanley (and Sony) a pass on their unwillingness to address the fact that they have people on the inside willing to do harm to their clients, their company, and, in some cases, our country.

Dealing with your crappy data

Let’s confront a real problem in Big Data. Inside your data warehouse lurk errors that could potentially render your data as useless as a thousand disks packed with random numbers. With all the hype in the industry around storage, transfer, data access, point-and-click analysis software, etc., more emphasis should be placed on detecting, evaluating, and mitigating data errors.

If you are thinking, “my data contains no errors”, you are living in denial. Unless you are storing output produced by a closed-form mathematical expression or a random number generator, you should recognize that any sensor or data collector — whether an industrial process temperature sensor, a bar code reader, a social media sentiment analyzer, or a human recording a customer transaction — generates some error along with the true value.

In its basic form, the data residing in your data warehouse, the “measured value”, is a combination of the true value plus some unknown error component:

 

Measured Value = True Value + Unknown Error

 

This expression is valid for continuous variables where differences can be quantified (like position, account balance, or the number of miles between customer and retail store). It is also valid for categorical variables: discrete values that can be ordered (like level of income or age category) or unordered (like product category or name of sales associate).

Now, if we knew the error for each measured value, we could simply subtract it and restore the true value. But, unfortunately, that is usually not possible (otherwise, we would just store the true value right away), so we have to characterize the error as an uncertainty:

 

True Value = Measured Value ± Uncertainty

 

Clearly, if the error is significant, it can hide the true value, resulting in useless or, worse, deeply misleading analytics. One of the fundamental responsibilities of a data scientist is to characterize the uncertainty in the data due to the error, and to know (or estimate) when the error could contribute to faulty analysis, incorrect conclusions, and ultimately bad decisions.

Leeds General Infirmary in England shut down its children’s heart surgery ward for eleven days in March 2013 after errors and omissions in patient data led the hospital’s directors to incorrectly conclude that the child mortality rate was almost twice the national average. We assess that had these data errors been properly detected and mitigated, the analysis would have been accurate, the ward would not have been shut down, and more patients could have been treated. Source: http://www.bbc.com/news/health-22076206

 

So, what are the types of errors in my data?

Data errors come in all shapes and sizes, and uncertainty analysis should consider both the type of error and its relative magnitude. A multitude of error-mitigation methods are available, but each is effective only for specific types of error, and careless application can even amplify the errors. Let’s decompose the unknown error into random errors and systematic errors, because the ways of handling them are very different:

Measured Value = True Value + Random Error + Systematic Error
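To make the decomposition concrete, here is a minimal Python sketch; the sensor, its 72-degree true value, the 0.5-degree noise, and the +1.8-degree bias are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical example: 1,000 temperature readings from a single sensor.
true_value = 72.0 * np.ones(1000)            # the (unknown in practice) true temperature
random_error = rng.normal(0.0, 0.5, 1000)    # independent noise, std dev 0.5 degrees
systematic_error = 1.8                       # a fixed calibration bias of +1.8 degrees

measured_value = true_value + random_error + systematic_error

# Averaging shrinks the random error but leaves the bias untouched.
print(measured_value.mean())   # ~73.8, not 72.0 -- the bias survives averaging
print(measured_value.std())    # ~0.5  -- the spread reflects the random error
```

Averaging a large sample beats down the random term, but the systematic term survives untouched – which is exactly why the two components need different treatment.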

 

Noise — the Random Errors

Suppose a sales associate repeatedly types in customers’ names, phone numbers, and addresses, and every once in a while makes a typo – a replacement, deletion, or insertion error. Any previous typo has no influence on the next one; the typos are assumed to be independent of each other. Or suppose you are recording temperature at various times and places in an industrial plant, where each sensor reports a variation about an average, and the average is an accurate representation of the true temperature for that time and location. In both examples, the errors have no sequential pattern and are entirely random, or noisy.

The nice thing about random errors is that they are readily handled by general-purpose statistical methods found in mainstream statistical software packages. Noise can be estimated and smoothed out with simple averages and standard deviations, or with more advanced filtering or prediction techniques such as Kalman filters or spectral analysis (for example, methods based on Fourier or wavelet transforms). Other noise sources, such as the random typos above, can be readily fixed using statistical parsers or spelling- and grammar-checking techniques.
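As one illustration of that statistical treatment (a sketch, not a recipe for every dataset – the series, window size, and noise level below are invented), a rolling mean in pandas can both smooth the noise and estimate its spread:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Hypothetical plant-temperature series: a slow drift plus independent noise.
t = np.arange(500)
true_temp = 70 + 0.01 * t
readings = pd.Series(true_temp + rng.normal(0, 0.8, t.size))

# Smooth the noise with a centered rolling mean, then estimate the noise level.
smoothed = readings.rolling(window=25, center=True, min_periods=1).mean()
noise_std = (readings - smoothed).std()

print(f"estimated noise std dev: {noise_std:.2f}")                 # close to the 0.8 used above
print(f"max smoothing error:     {np.abs(smoothed - true_temp).max():.2f}")
```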

 

Bias — the Systematic Errors

Just like a wristwatch that is running ahead by three minutes, systematic errors maintain consistent patterns from measurement to measurement. An important distinction between random errors and systematic errors is that while random errors can be handled statistically, systematic errors cannot (Taylor, 1997).

Unfortunately, biases – fixed or drifting offset errors – tend to work their way into most measurements from all types of data collectors, from uncalibrated sensors to inadequately sampled survey populations. In fact, the only real way to detect or eliminate bias is to compare with some form of truth reference. For example, one of our customers presented us with data sets of physical measurements from heavy land machinery, where each machine reported large quantities of data with different biases. We quantified bias characteristics by comparing the machinery measurements against a validated baseline for a limited subset of the data. Then, by applying relaxation algorithms, we were able to minimize the bias errors relative to the machines.
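That baseline-comparison workflow is straightforward to prototype. The following sketch uses invented machine IDs, biases, and column names – not our customer’s actual data – to estimate a per-machine offset from a small calibrated subset and subtract it everywhere else:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)

# Hypothetical measurements from three machines, each with its own fixed bias.
n = 300
df = pd.DataFrame({
    "machine": rng.choice(["A", "B", "C"], size=n),
    "true": rng.uniform(100, 200, size=n),     # unknown in practice
})
bias = {"A": 4.0, "B": -2.5, "C": 0.7}
df["measured"] = df["true"] + df["machine"].map(bias) + rng.normal(0, 1.0, n)

# Only a small subset has a validated baseline available.
calib = df.sample(frac=0.1, random_state=0)
offset = (calib["measured"] - calib["true"]).groupby(calib["machine"]).mean()

# Apply the estimated correction to every measurement from each machine.
df["corrected"] = df["measured"] - df["machine"].map(offset)
print(offset.round(2))                                  # close to the biases above
print((df["corrected"] - df["true"]).abs().mean())      # residual error is mostly noise
```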

Suppose we survey a consumer group about product preferences from urban communities in one large metropolitan area. Even with very large sample sizes, are these results meaningful for rural populations? Are they representative of other cities throughout the US? If we gathered even more responses from the same city, would our data be a better approximation for the US? The answer is most likely no in each case, because the data is biased toward the preferences of the sampled population.

Another representative example of sampling bias that we see regularly – based on the Nyquist criterion – suppose we want to compute the slopes along a particular road from GPS elevation measurements recorded at regular quarter-mile intervals. The problem with this approach is that any slope between two hilltops separated by less than half a mile will be aliased – the slopes will appear much smaller than they really are. (This is the same type of problem that causes wagon wheels to appear to rotate backwards in old western movies.)
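The aliasing effect is easy to demonstrate with a short simulation; the hill spacing, amplitude, and units below are made up purely for illustration:

```python
import numpy as np

# Hills 0.4 miles apart (wavelength 0.4 mi) sampled every quarter mile:
# the wavelength is below the 0.5-mile Nyquist limit, so the slopes alias.
wavelength_mi = 0.4
amplitude_ft = 50.0

def elevation(x_mi):
    return amplitude_ft * np.sin(2 * np.pi * x_mi / wavelength_mi)

fine_x = np.arange(0, 10, 0.001)     # "ground truth" on a fine grid
gps_x = np.arange(0, 10, 0.25)       # GPS samples at quarter-mile intervals

true_max_slope = np.max(np.abs(np.gradient(elevation(fine_x), fine_x)))
sampled_max_slope = np.max(np.abs(np.diff(elevation(gps_x)) / 0.25))

print(f"true max slope:    {true_max_slope:6.0f} ft/mi")    # roughly 785 ft/mi
print(f"aliased max slope: {sampled_max_slope:6.0f} ft/mi") # far smaller than reality
```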

We regularly see such biases working themselves undetected into analytics that could lead to bad decisions. In our experience, detecting and mitigating bias is much more challenging than dealing with random error, because it requires an intimate knowledge of the domain and because standard statistical methods are not generally applicable.

 

How to make your crappy data useful

Now that we have described how error can dramatically reduce the utility of your data, what should one do to mitigate its bad influence on analytics?

First, know your data and quantify its uncertainty. Understand the conditions and environment under which the data is collected, then, for a representative part of your data, find a trustworthy baseline to compare against. Use the baseline to “reverse-engineer” the errors, quantifying the random and systematic errors separately, since the mitigation techniques for each are quite different. Describe the spread of the random error with common measures such as the standard deviation; describe the systematic fixed or varying offset errors as means and slopes over sequential segments of the data. It is especially important to characterize the uncertainty when merging in new data sources, to ensure that the new data doesn’t amplify the errors significantly.
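A minimal sketch of that characterization step, assuming a synthetic baseline and invented error magnitudes, computes the offset (bias), the spread (noise), and a drift slope from the residuals:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(11)

# Hypothetical warehouse column, with a trusted baseline available for a representative slice.
baseline = rng.uniform(0, 100, 200)
measured = baseline + 3.0 + rng.normal(0, 1.5, 200)    # +3.0 fixed offset, 1.5 noise

residual = pd.Series(measured - baseline)

bias_estimate = residual.mean()                        # systematic component: fixed offset
noise_estimate = residual.std()                        # random component: spread
drift = np.polyfit(np.arange(residual.size), residual, 1)[0]   # slope of a drifting offset

print(f"bias  = {bias_estimate:.2f}")       # close to 3.0
print(f"noise = {noise_estimate:.2f}")      # close to 1.5
print(f"drift = {drift:.4f} per record")    # near zero here, since the offset is fixed
```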

Second, understand how error can affect your analysis. We frequently use simulation for a sensitivity analysis, starting from an error-free condition and progressively increasing the random and systematic errors until we detect a significant reduction in performance. Suppose we have a model that predicts whether an automobile service customer will return to the service bay, based in part on the distance the customer lives from it. We can then inject different error conditions into the distance variable and empirically determine when the model fails to predict customer behavior reliably.
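Here is a hedged sketch of such a sensitivity analysis using scikit-learn; the customer-return model, the distance variable, and all of the numbers are fabricated for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(5)

# Hypothetical data: customers who live closer are more likely to return for service.
n = 5000
distance = rng.uniform(0, 50, n)                          # miles from the service bay
p_return = 1 / (1 + np.exp(0.15 * (distance - 20)))       # made-up ground-truth relationship
returns = rng.random(n) < p_return

model = LogisticRegression().fit(distance.reshape(-1, 1), returns)

# Progressively corrupt the distance variable and watch accuracy degrade.
for noise_std in [0, 2, 5, 10, 20]:
    for bias in [0, 5, 10]:
        corrupted = distance + rng.normal(0, noise_std, n) + bias
        acc = accuracy_score(returns, model.predict(corrupted.reshape(-1, 1)))
        print(f"noise={noise_std:>2}, bias={bias:>2} -> accuracy={acc:.2f}")
```

The grid of noise and bias levels is the whole point: it tells you roughly how much error the model can tolerate before its predictions stop being trustworthy.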

Third, apply error mitigation as a preprocessing stage. In our experience, many analytic tools, such as classifiers, perform better when the random error is smoothed out. Unmitigated biases can propagate inconsistent data features into downstream analytics, so it is useful to first determine the regions of the data that are potentially affected by bias. Once the high-bias regions are identified, they can be excluded if the bias error cannot be mitigated. Detection and mitigation of bias is specifically tailored to the type of data and the method of collection.
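A preprocessing pass along these lines might look like the sketch below – the sensor feed, window sizes, and the 5-unit bias threshold are invented, and a real pipeline would tune them to the data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)

# Hypothetical sensor feed: noisy throughout, and biased over a known faulty interval.
df = pd.DataFrame({"timestamp": pd.date_range("2015-01-01", periods=1000, freq="min")})
df["reading"] = 100 + rng.normal(0, 2.0, 1000)
df.loc[400:600, "reading"] += 15            # a region contaminated by a fixed bias

# Step 1: smooth the random error before feeding downstream analytics.
df["smoothed"] = df["reading"].rolling(window=15, center=True, min_periods=1).mean()

# Step 2: flag regions whose local mean departs sharply from the global median,
# and exclude them if the bias cannot be corrected.
local_mean = df["reading"].rolling(window=50, center=True, min_periods=1).mean()
df["suspect_bias"] = (local_mean - df["reading"].median()).abs() > 5
clean = df[~df["suspect_bias"]]

print(f"rows excluded as high-bias: {df['suspect_bias'].sum()}")
```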

So, how crappy is your data?

Do you know if the errors are affecting your results and providing potentially flawed “insights”? Are you tracking the noise or the signal? Is your data so corrupted by error that any advanced analytics lead to contradictory conclusions? If so, you may need to refocus your corporate data strategy on more rigorous error characterization and mitigation techniques.