Posts

Data Scientist / Algorithm Engineer in McLean, VA

SphereOI is a high performance studio with exceptional data scientists, engineers, and product designers. We value digital innovation that challenges the status quo by making products that are meaningful for commercial and government clients. There is no checklist mentality for product development at SphereOI. Our innovation follows a North Star vision and design with Strong Centers that keeps each product authentic to what is most meaningful to the customer. By centering our effort on what is most meaningful, we deliver transformative innovation.

This position is for a Data Scientist / Algorithm Engineer who will work in our studio to develop and operationalize advanced analytic solutions for our customers. You can expect to work on real world challenges in a wide range of applications such as telematics data exploitation, behavior profiling, cybersecurity threat detection or responsive agriculture analytics. Some of the technology disciplines you will be working with include attribution analysis, dynamic component influencer analysis, artificial intelligence, machine learning, and anomaly detection. The ideal candidate for this position has a deep knowledge of machine learning algorithms but also has a strong desire to get business value from analytics.

What You Will Be Doing

  • Develop predictive models using R, Java, or Python
  • Perform data reduction and normalization; extract and combine information-rich features
  • Participate in a weekly “Demo Thursday” where the team demonstrates work-in-progress
  • Recommend machine learning algorithms as well as suitable modifications
  • Develop, enhance, test and evaluate algorithms
  • Develop and conduct experiments for performance validation

What You Need for this Position

  • Bachelor’s Degree (Master’s Degree is preferred) in statistics, mathematics, physics, or a related field
  • Experience developing statistical, mathematical and predictive models
  • Knowledge of supervised learning techniques such as neural nets, CART, regressions
  • Knowledge of unsupervised learning techniques such as clustering or segmentation

Apply now: recruiter@sphereoi.com

    Chief Data Scientist in McLean, VA

    SphereOI is a high performance studio with exceptional data scientists, engineers, and product designers. We value digital innovation that challenges the status quo by making products that are meaningful for commercial and government clients. There is no checklist mentality for product development at SphereOI. Our innovation follows a North Star vision and design with Strong Centers that keeps each product authentic to what is most meaningful to the customer. By centering our effort on what is most meaningful, we deliver transformative innovation.

    This position is for a Chief Data Scientist who will develop strategy and key technology direction for Fortune 500 customers. In this role, you will lead the development of innovative responsive analytics solutions to support customers in a wide range of domains such as telematics data exploitation, behavior profiling, cybersecurity threat detection and responsive agriculture analytics. The ideal candidate has four years of experience working in a data science role and experience mentoring a team. You will be working in an engineering-centric company culture as an individual contributor and team leader in a highly entrepreneurial environment.

    What You Will Be Doing

  • Provide data science strategic and technical direction
  • Develop predictive models using R, Java, or Python
  • Recommend machine learning algorithms as well as suitable modifications
  • Design and conduct experiments for performance validation
  • Mentor data scientists in algorithms, models, tools and products that make the team more productive

    What You Need for this Position

  • Master’s Degree (PhD is preferred) in statistics, mathematics, physics, or a related field
  • Experience developing statistical, mathematical and predictive models
  • Ability to explain complex models and analysis in layman’s terms
  • Knowledge of anomaly detection and signal processing is a plus
  • Experience with a programming language such as C++, C#, or Java is a plus

    Apply now: recruiter@sphereoi.com

    Too Little, Too Late – Morgan Stanley could have prevented the Data Leak

     

    In a recent article about the Morgan Stanley insider theft case, Gregory Fleming, the president of the wealth management arm, said:

    “While the situation is disappointing, it is always difficult to prevent harm caused by those willing to steal”

    Disappointing?  350,000 clients were compromised, the top 10% of investors, and this following a breach that left 76 million households exposed.

    Morgan Stanley fired one employee

    The fact is, this breach was preventable. Firms like Morgan Stanley are remiss in allowing these breaches to occur, and they add to the problem by perpetuating the myth that insider theft cannot be stopped.  The minimal approach of repurposing perimeter cyber security solutions does not work.  These perimeter solutions and practices were in place in each insider breach, including those at the U.S. government (e.g. Bradley Manning, Edward Snowden), Goldman Sachs, and Morgan Stanley.  Even Sony Entertainment had some intrusion protection in place.  Cyber security professionals remain one step behind the criminals in defining events, thresholds, and signatures, and none of these are effective against the insider.

    Building behavioral profiles for all employees, managers, and executives using objective criteria is the best, and possibly the only, feasible way to catch the insider.  Current approaches that search for malicious insiders by judging the appropriateness of the web sites an employee visits, or by inferring an employee’s stability from marital status, seem logical but provide little value.  Plenty of people get divorced without stealing from their employers or their country.

    Rules and thresholds defined by human resource and cybersecurity professionals have proven ineffective at stopping the insider.  Data analytics using unsupervised machine learning on a large, diverse dataset is essential.  Sphere of Influence developed this technology and created the Personam product and company.

    Personam catches insiders before damaging exfiltrations.  It is designed for the insider threat, both human and machine based, and has a proven record of identifying illegal, illicit, and inadvertent behaviors that could have led to significant breaches.

    The malicious insider can be caught, and it is time to take the threat seriously and time to stop giving firms like Morgan Stanley (and Sony) a pass on their unwillingness to address the fact that they have people on the inside willing to do harm to their clients, their company, and in some cases, our country.

    If computers can beat Jeopardy! champions, why can’t they detect the insider threat?

    The world was awed two years ago when IBM’s Watson defeated Jeopardy! champions Brad Rutter and Ken Jennings. Watson’s brilliant victory reintroduced the potential of machine learning to the public. Ideas flowed, and now this technology is being applied practically in the fields of healthcare, finance and education. Emulating human learning, Watson’s success lies in its ability to formulate hypotheses using models built from training questions and texts.

     

    Three years ago, Army Private First Class Bradley Manning leaked massive amounts of classified information to WikiLeaks and brought the significance of data breaches to public awareness. In response to this and several other highly publicized data breaches, government committees and task forces established recommendations and policies, and invested heavily in cyber technologies to prevent such an event from recurring. Surely, we thought, if anyone had the motivation and resources to get a handle on the insider threat problem, it was the government. But Edward Snowden, who caused the recent NSA breach, has made it painfully obvious how impotent the response was.

     

    Lest we assume this is just a government problem, abundant evidence shows how vulnerable commercial industry is to the insider. We are inundated with articles describing how malicious insiders have cost private enterprise billions of dollars in lost revenue, so why has no one offered a plausible solution?

     

    The insider threat remains an unmitigated problem for most organizations, not because the technologies do not exist, but rather because the cyber defense industry is still attempting to discover the threat using a rules-based paradigm. Virtually all cyber defense solutions in the market today apply explicit rules, whether they are antivirus programs, firewalls with access control lists, deep packet inspectors, or protocol analyzers. This paradigm is very effective in defending against known malware and network exploits, but fails utterly when confronted with new attacks (i.e. “zero-days”) or the surreptitious insider.

     

    In contrast, acknowledging that it was impossible to build a winning system that relied on enumerating all possible questions, IBM designed Watson to generalize and learn patterns from previous questions and use these models to hypothesize answers to novel questions. The hypothesis with the highest confidence was selected as the answer.

     

    Like Watson, an effective technology for detecting the insider must adaptively learn historical network patterns and then use those patterns to automatically discover anomalous activity. Such anomalous traffic is symptomatic of unauthorized data collection and exfiltration.

     

    Inspired by the WikiLeaks incident, Sphere’s R&D team has investigated machine learning algorithms that construct historical models by grouping users by their network fingerprints. As an example, without any rules or specifications, the algorithms learn that bookkeeping applications transmit a distinctive pattern that enables grouping accountants together, and HR professionals are grouped by the recruiting sites they visit. These behavioral models generalize normal activity and can be used as templates to detect outliers. While users commonly generate some outliers, suspicious users deviate significantly from their cohorts, such as the network administrator who accesses the HR department’s personnel records. Like Watson, the models allow the system to form hypotheses.
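
The cohort idea described above can be illustrated with a small sketch. This is not Sphere's algorithm, which the post does not specify; it assumes each user's history is summarized as a vector of traffic shares, uses a simple greedy grouping on cosine distance in place of whatever clustering the R&D team used, and all names, feature columns, and the 0.3 threshold are hypothetical.

```python
import math

# Hypothetical feature order: share of traffic to each kind of destination.
FEATURES = ("bookkeeping_app", "recruiting_sites", "admin_tools", "hr_records")

def cosine_distance(a, b):
    """Return 1 - cosine similarity: near 0 for the same traffic mix, near 1 for unrelated mixes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def centroid(vectors):
    """Average fingerprint of a cohort, used as its behavioral template."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def group_users(history, threshold=0.3):
    """Greedy grouping: join the first cohort whose template is close enough, else start a new one."""
    cohorts = []
    for user, fingerprint in history.items():
        for members in cohorts:
            template = centroid([history[m] for m in members])
            if cosine_distance(fingerprint, template) < threshold:
                members.append(user)
                break
        else:
            cohorts.append([user])
    return cohorts

def is_outlier(user, new_fingerprint, history, cohorts, threshold=0.3):
    """Flag new activity that strays too far from the template of the user's own cohort."""
    members = next(c for c in cohorts if user in c)
    template = centroid([history[m] for m in members])
    return cosine_distance(new_fingerprint, template) > threshold

# Made-up historical fingerprints; no job titles are given to the algorithm.
history = {
    "acct_1": [90, 2, 0, 0],
    "acct_2": [85, 5, 1, 0],
    "hr_1": [3, 80, 0, 40],
    "hr_2": [1, 75, 2, 50],
    "admin_1": [0, 1, 95, 0],
    "admin_2": [2, 0, 90, 1],
}
cohorts = group_users(history)

# An administrator who suddenly spends most traffic on HR records leaves the cohort template.
suspicious = is_outlier("admin_1", [0, 0, 40, 60], history, cohorts)
routine = is_outlier("admin_2", [1, 0, 88, 2], history, cohorts)
print(cohorts, suspicious, routine)
```

The grouping here depends on the order in which users are seen, which is acceptable for an illustration; a production system would more likely use a proper clustering method and many more feature dimensions.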

     

    Applied to cyber security, every time an entity accesses the network, the algorithms hypothesize whether the activity conforms to that entity’s model. If it does not conform, the activity is labeled an outlier. Because these methods use a statistical confidence that dynamically balances internal thresholds on network activities (e.g., sources and destinations, direction and amount of data transferred, times, protocols, etc.), the system becomes extremely hard for a malicious insider to outsmart. The mere fact that the system does not reveal its thresholds can have a significant deterrent effect.
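
As a rough illustration of scoring each access against an entity's model, the sketch below sums a smoothed "surprise" (negative log-probability) over a few categorical fields and derives the alert threshold from the entity's own history rather than from a hand-written rule. The field names, the smoothing scheme, and the mean-plus-three-standard-deviations cutoff are assumptions made for this example, not the method used in the product.

```python
import math
from collections import Counter
from statistics import mean, pstdev

# Hypothetical event fields; a real system would track many more dimensions.
FIELDS = ("destination", "protocol", "hour_bucket")

class EntityModel:
    """Per-entity frequency model with a threshold learned from the entity's own past."""

    def __init__(self, events):
        self.counts = {f: Counter(e[f] for e in events) for f in FIELDS}
        self.total = len(events)
        scores = [self.surprise(e) for e in events]
        # The threshold adapts to how varied this entity normally is.
        self.threshold = mean(scores) + 3 * pstdev(scores)

    def surprise(self, event):
        """Sum of -log(probability) over fields; unseen values score highest."""
        score = 0.0
        for f in FIELDS:
            seen = self.counts[f]
            # Laplace smoothing: one pseudo-count per known value plus one for "unseen".
            p = (seen[event[f]] + 1) / (self.total + len(seen) + 1)
            score += -math.log(p)
        return score

    def is_outlier(self, event):
        return self.surprise(event) > self.threshold

# Made-up history: mostly mail traffic, some wiki browsing, all during the day.
history = (
    [{"destination": "mail", "protocol": "smtp", "hour_bucket": "day"}] * 40
    + [{"destination": "wiki", "protocol": "https", "hour_bucket": "day"}] * 10
)
model = EntityModel(history)

routine_event = {"destination": "mail", "protocol": "smtp", "hour_bucket": "day"}
odd_event = {"destination": "hr-records", "protocol": "ftp", "hour_bucket": "night"}
print(model.is_outlier(routine_event), model.is_outlier(odd_event))
```

Because the cutoff is computed from each entity's own score distribution, there is no single published limit (such as a fixed byte count) for an insider to stay under.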

     

    A paradigm shift in cyber technologies is happening now. Cyber security professionals agree that preventing data breaches from a malicious insider is a difficult task, and the past suggests that the next major breach will not be detected with existing rules-driven cyber defense solutions. Next generation cyber security technology developers must seek inspiration from IBM’s Watson and other successful implementations of machine learning before we can hope to prevail against the insider threat.


     

    Wake up! It’s the insider threat you need to worry about

    THE INSIDER THREAT IS DETECTABLE

    AND LOSSES ARE PREVENTABLE WITH EARLY DETECTION

    Edward Snowden is the new face of the insider threat; the media even calls him the “Ultimate Insider Threat.”  This is someone who has the highest-level security clearance, endures a background reinvestigation every 5 years, takes a polygraph exam, and still betrays his sacred oath and the trust of his employers.
    When it comes to asserting workforce trustworthiness, industry and government are both guilty of over-relying on employment pre-screening, background investigations, and oaths.  These are effective to a degree and good first steps but obviously inadequate when it comes to preventing losses and breaches.

    Insider threats are detectable because they don’t behave exactly like everyone else.  Maybe on the surface these people appear to be the same as their coworkers, but at some level their behaviors are different.  A sensitive enough instrument can detect such subtle differences in behavior, and if the noise of anomalies can be removed then high-quality actionable alerts can be generated from the “unusual anomalies”.  This is the basis of the Insider Threat Detection technology that has been developed by Sphere of Influence over the past two years.

    The problem isn’t cyber-security, which is focused on the threat of digital attacks against digital assets.  This is an industrial security threat, where a person of trust betrays that trust and misuses access to cause deep harm or substitute a third-party agenda.  Unlike a cyber-attacker, an effective insider might not even use your digital assets as the vehicle for attack or exfiltration; they might steal files from a safe or do other things.  However, if a person’s normal behavioral modalities change even slightly, shadows of those changes are often reflected in how they use the computer.  Computer activity can therefore yield a behavioral profile for an individual, even if the actual threatening behavior is more analog than digital.

    By connecting a sensitive behavioral profiling instrument to a network we can construct individual profiles that are accurate enough to perform this type of anomaly detection. Such algorithm-synthesized profiles apply to human and non-human users of a network, giving some cyber-security crossover to this approach in addition to the industrial security focus. However, Insider Threat Detection is not cyber-security; it is industrial security that uses cyber-technology as a sensor.

    In our case the goal of this technology is to detect the active insider threat early in the activity cycle. We believe strongly that there is no way to fully prevent insider threats from occurring because no background screening process on Earth will ever accomplish that. To defend against the insider we believe early detection of active threat behaviors is the key to loss prevention.
    This is possible thanks to Advanced Data Analytics (Analytics 2.0) techniques which evaluate dozens (or even thousands) of simultaneous feature dimensions on Big Data under a powerful layer of unsupervised machine learning. What makes insider threat detection different from conventional Analytics 2.0 is that it must work on streaming data, in real time, and at scale.
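
To make the streaming requirement concrete, here is a minimal sketch that keeps a constant-size running baseline per user (Welford's online mean and variance) and scores each event as it arrives, without storing or re-reading history. It tracks a single made-up feature, outbound bytes, and the `z_limit` and `warmup` values are arbitrary assumptions; the instrument described in this post evaluates many feature dimensions and is not documented at this level of detail.

```python
import math

class RunningStat:
    """Welford's online algorithm: mean and variance in O(1) memory per tracked value."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

def stream_alerts(events, z_limit=4.0, warmup=10):
    """Yield (user, bytes_out, z) for events far above that user's own running baseline."""
    stats = {}
    for user, bytes_out in events:
        stat = stats.setdefault(user, RunningStat())
        if stat.n >= warmup and stat.std > 0:
            z = (bytes_out - stat.mean) / stat.std
            if z > z_limit:
                yield user, bytes_out, round(z, 1)
                continue  # keep the flagged value out of the baseline
        stat.update(bytes_out)

# Made-up stream: twenty ordinary transfers followed by one very large one.
events = [("u1", 100 + (i % 5)) for i in range(20)] + [("u1", 5000)]
alerts = list(stream_alerts(events))
print(alerts)
```

Each event is processed once and per-user state is three numbers, which is the property that lets this style of scoring run on live traffic rather than in batch.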

    At Sphere of Influence, because we have been so invested in Advanced Data Analytics these past few years, we were able to solve these problems and invented an instrument that does what I describe here.  We use it every day on our networks and it is already installed at beta customers, primarily law offices.

    The bottom line is that even the most intense background checks are not good enough. You need to be able to detect insider threats when they become active, before those threats move to Hong Kong.
