Posts

Data Scientist / Machine Learning Expert in McLean, VA

SphereOI is an award-winning digital studio devoted to using Artificial Intelligence (AI) for enterprise clients. We conceive, develop, train, and engineer production-ready machine learning solutions to achieve superhuman performance. More than a data science shop, we are an engineering studio built on data science. Our approach is an unusual blend of machine learning with design-thinking and production software engineering. We make things.

This position is for a Data Scientist / Machine Learning Expert who will work in our studios as a member of our data science team. The position will support multiple clients over time, crossing a broad spectrum of industries. The pace is fast, but the technology and opportunities to experiment with new approaches are exciting. The position requires a significant proficiency with statistical science in the context of machine learning. The work involves exploration, experimentation, and implementation. Programming skill is essential. The more varied your background, the better. Data, big and small, is a ubiquitous element, and familiarity with modern data architecture and query languages is important. This position will, over time, draw on your skills in NLP, computer vision, cyber-physical systems, and deep/reinforcement learning.

What You Will Be Doing

  • Learn new domains and problems
  • Experiment with models and data
  • Push the boundaries of ML in computer vision, agent design, NLP, and other areas
  • Collaborate with other experts
  • Develop reference implementations of models and pipelines
  • Work in Python, possibly other languages (e.g. R, Java)
  • Present at weekly demos and various sessions
  • Support production engineering teams and develop automated tests

What You Need for This Position

  • Talent for machine learning fundamentals. We stress the fundamentals.
  • An advanced degree in math, statistics, data science, physics or computer science is preferred, but not required if you have experience in the field
  • Hands-on coding experience using Python, R, Java, C#, or SQL is preferred
  • Experience with TensorFlow or similar frameworks
  • Experience with scalable data architectures (e.g. Spark, Hadoop) is a plus

 

Interested? Send us any comments and your resume.

 

Sphere of Influence Expands Data Analytics Studio

Sphere of Influence – a leader in value-added data science for high-volume, high-velocity and high-variety information assets – today announced continued investment in its McLean, VA operations, where it has doubled its data science team over the past year. The company – which recently expanded operations into Denver, CO – is also growing its digital solutions team.

The expansion of the Sphere of Influence data science studio coincides with the ramp-up of the company’s latest offering – analytics that predict customer experience for software systems.

“Our team of data strategists, data scientists and software developers has been creating exciting innovations that will make a real difference for businesses in competitive markets,” said Sphere of Influence Director of Accounts, Scott Pringle.  “Sphere of Influence has taken the steps to bring new data science solutions to our customers and expanded our science team to position the company for exciting new growth opportunities in 2016.”

About Sphere of Influence, Inc.

Sphere of Influence fuses advanced data science with digital solutions to deliver transformative products.  The company specializes in advanced data analytics for high-volume, high-velocity and high-variety information assets from a wide range of sensors in precision agriculture, automotive, and Internet of Things (IoT) telematics.  The company utilizes a broad and continuously growing integrated infrastructure of proprietary data science platforms, algorithms, and machine learning systems.  For additional information, please visit Sphere of Influence’s corporate website at:  www.sphereoi.com.


 

 

Data Science to Stop Terrorist Counterfeiters

The U.S. Government has awarded Sphere of Influence, Inc. a contract to develop new technology that helps the U.S. Government understand more about terrorist networks that create forged identity documents.

Sphere of Influence, Inc., a McLean, Virginia-based developer of advanced data analytics technologies, announced it has been awarded a contract by the U.S. Department of Defense (DoD) to build a data science platform that enables the U.S. Government to understand more about terrorist networks and the forged identity documents they produce. The contract has an estimated value of $700k for one year. Under the terms of the contract, Sphere of Influence, Inc. will deliver technologies that apply advanced data science, computer vision, and machine learning algorithms.

With this contract, the U.S. Government will learn not only more about the networks that create counterfeit identity documents, but also how those networks use them.

About Sphere of Influence, Inc.

Sphere of Influence, Inc. provides technologies for advanced data analytics and interactive digital solutions. The company was formed in 2000 and is headquartered in McLean, VA.


If computers can beat Jeopardy! champions, why can’t they detect the insider threat?

The world was awed two years ago when IBM’s Watson defeated Jeopardy! champions Brad Rutter and Ken Jennings. Watson’s brilliant victory reintroduced the potential of machine learning to the public. Ideas flowed, and now this technology is being applied practically in the fields of healthcare, finance and education. Emulating human learning, Watson’s success lies in its ability to formulate hypotheses using models built from training questions and texts.

 

Three years ago, Army Private First Class Bradley Manning leaked massive amounts of classified information to WikiLeaks and brought the significance of data breaches to public awareness. In response to this and several other highly publicized data breaches, government committees and task forces established recommendations and policies, and invested heavily in cyber technologies to prevent such an event from recurring. Surely, we thought, if anyone had the motivation and resources to get a handle on the insider threat problem, it was the government. But Edward Snowden, who caused the recent NSA breach, has made it painfully obvious how impotent the response was.

 

Lest we assume this is just a government problem, abundant evidence shows how vulnerable commercial industry is to the insider. We are inundated with articles describing how malicious insiders have cost private enterprise billions of dollars in lost revenue, so why has no one offered a plausible solution?

 

The insider threat remains an unmitigated problem for most organizations, not because the technologies do not exist, but rather because the cyber defense industry is still attempting to discover the threat using a rules-based paradigm. Virtually all cyber defense solutions in the market today apply explicit rules, whether they are antivirus programs, firewalls with access control lists, deep packet inspectors, or protocol analyzers. This paradigm is very effective in defending against known malware and network exploits, but fails utterly when confronted with new attacks (i.e. “zero-days”) or the surreptitious insider.

 

In contrast, acknowledging that it was impossible to build a winning system that relied on enumerating all possible questions, IBM designed Watson to generalize and learn patterns from previous questions and use these models to hypothesize answers to novel questions. The hypothesis with the highest confidence was selected as the answer.

 

Like Watson, an effective technology for detecting the insider must adaptively learn historical network patterns and then use those patterns to automatically discover anomalous activity. Such anomalous traffic is symptomatic of unauthorized data collection and exfiltration.

 

Inspired by the WikiLeaks incident, Sphere’s R&D team has investigated machine learning algorithms that construct historical models by grouping users by their network fingerprints. As an example, without any rules or specifications, the algorithms learn that bookkeeping applications transmit a distinctive pattern that enables grouping accountants together, and that HR professionals can be grouped by the recruiting sites they visit. These behavioral models generalize normal activity and can be used as templates to detect outliers. While most users generate some outliers, suspicious users deviate significantly from their cohorts, such as the network administrator who accesses the HR department’s personnel records. Like Watson, the models allow the system to form hypotheses.
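
The grouping idea can be illustrated with a small sketch. This is not Sphere's algorithm, which the post does not describe; it assumes each user is summarized as a vector of traffic shares across a few destination categories, and it uses a simple greedy cosine-similarity grouping as a stand-in for whatever clustering method is actually used. The user names, categories, and the 0.8 similarity cutoff are all hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroid(vectors):
    """Component-wise average of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def group_users(fingerprints, min_similarity=0.8):
    """Greedy grouping: a user joins the first cohort whose centroid is
    similar enough; otherwise the user starts a new cohort."""
    cohorts = []  # each cohort is a list of user names
    for user, vec in fingerprints.items():
        for members in cohorts:
            center = centroid([fingerprints[m] for m in members])
            if cosine(vec, center) >= min_similarity:
                members.append(user)
                break
        else:
            cohorts.append([user])
    return cohorts

def deviation(vec, cohort_vectors):
    """How far one activity vector sits from its cohort (0 = identical direction)."""
    return 1.0 - cosine(vec, centroid(cohort_vectors))

# Hypothetical fingerprints: share of traffic going to
# [bookkeeping app, recruiting sites, admin consoles, HR records]
fingerprints = {
    "acct_1": [0.80, 0.05, 0.05, 0.10],
    "acct_2": [0.75, 0.10, 0.05, 0.10],
    "hr_1": [0.05, 0.60, 0.00, 0.35],
    "hr_2": [0.10, 0.55, 0.05, 0.30],
    "admin_1": [0.05, 0.05, 0.90, 0.00],
    "admin_2": [0.05, 0.00, 0.90, 0.05],
}
cohorts = group_users(fingerprints)

# An administrator whose traffic suddenly shifts toward HR records.
admin_vectors = [fingerprints["admin_1"], fingerprints["admin_2"]]
usual = deviation(fingerprints["admin_1"], admin_vectors)
suspicious = deviation([0.05, 0.00, 0.35, 0.60], admin_vectors)
```

No roles are supplied as input; the accountants, HR staff, and administrators fall into separate cohorts purely from their traffic mix, and the administrator's HR-heavy day scores far higher than a normal one.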

 

Applied to cyber security, every time an entity accesses the network, the algorithms hypothesize whether the activity conforms to its model. If it does not conform, that activity is labeled an outlier. Because these methods use a statistical confidence that dynamically balances internal thresholds on network activities (e.g., sources and destinations, direction and amount of data transferred, times, and protocols), the system becomes extremely hard for a malicious insider to outsmart. The mere fact that the system does not reveal its thresholds can have a significant deterrent effect.
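
A minimal sketch of this kind of scoring follows. It assumes a per-entity baseline built from past sessions and combines per-feature z-scores into a single score, so no single feature has a fixed cutoff. The feature names, the sample history, and the threshold of 3.0 are hypothetical; the post does not disclose the statistical method actually used.

```python
from math import sqrt
from statistics import mean, stdev

def build_model(history):
    """Learn a (mean, standard deviation) pair per feature from past sessions."""
    model = {}
    for feature in history[0]:
        values = [session[feature] for session in history]
        model[feature] = (mean(values), stdev(values))
    return model

def anomaly_score(model, session):
    """Root-mean-square z-score across all features of one session."""
    squared = []
    for feature, (mu, sigma) in model.items():
        z = (session[feature] - mu) / sigma if sigma else 0.0
        squared.append(z * z)
    return sqrt(sum(squared) / len(squared))

def is_outlier(model, session, threshold=3.0):
    """Flag the session when its combined score exceeds the threshold."""
    return anomaly_score(model, session) > threshold

# Hypothetical history for one user: megabytes sent out, hour of day,
# and number of distinct destinations contacted per session.
history = [
    {"mb_out": 12, "hour": 9, "destinations": 4},
    {"mb_out": 15, "hour": 10, "destinations": 5},
    {"mb_out": 10, "hour": 14, "destinations": 3},
    {"mb_out": 14, "hour": 11, "destinations": 4},
    {"mb_out": 11, "hour": 16, "destinations": 6},
    {"mb_out": 13, "hour": 13, "destinations": 5},
]
model = build_model(history)

typical = {"mb_out": 13, "hour": 11, "destinations": 4}
bulk_transfer = {"mb_out": 900, "hour": 2, "destinations": 1}
typical_flag = is_outlier(model, typical)
bulk_flag = is_outlier(model, bulk_transfer)
```

Because the score blends several features, an insider cannot stay under the radar by watching one number, such as transfer size, while behaving unusually on the others. A fuller model would also treat hour of day as circular, which this sketch ignores.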

 

A paradigm shift in cyber technologies is happening now. Cyber security professionals agree that preventing data breaches by a malicious insider is a difficult task, and the past suggests that the next major breach will not be detected with existing rules-driven cyber defense solutions. Next-generation cyber security technology developers must seek inspiration from IBM’s Watson and other successful implementations of machine learning before we can hope to prevail against the insider threat.
