
    IBM Launches Apache Spark-Based Data Science Experience

    By Darryl K. Taft, June 7, 2016

      IBM has launched a new application for Apache Spark called the Data Science Experience, which the company is referring to as the first enterprise application for Apache Spark.

      In an interview with eWEEK, Ritika Gunnar, vice president of Offering Management for IBM Analytics, described the new IBM Data Science Experience as a cloud-based development environment for real-time, high-performance analytics that gives data scientists and developers the ability to access and ingest vast amounts of data and deliver new business insights.

      IBM made the announcement at Spark Summit 2016 in San Francisco. Last year at this event, IBM initiated a $300 million investment in making Spark the analytics operating system for the company’s big data efforts, and this move builds on that investment. Gunnar would not put a price tag on this year’s launch but said it was only one “drumbeat” among many more to come.

      The new Data Science Experience will run on IBM’s Bluemix cloud development platform and will simplify the work of embedding data and machine learning into cloud applications, Gunnar said.

      “We are enabling the data science community to build machine learning applications very efficiently by leveraging Spark,” Gunnar told eWEEK. “We made a big commitment last year into development resources and into investment in core open-source Apache Spark because we believed it was transforming how analytics were being run across businesses.”

      IBM not only put development investment into the open componentry itself, but over the past year, the company built well over 30 internal applications on Apache Spark, she said.

      IBM interviewed a large number of data science professionals and concluded that data science as practiced today is an “individual sport,” Gunnar said. With the Data Science Experience, IBM is attempting to make it a team sport, she noted. This is particularly important as businesses try to use more forms of data across more parts of organizations to make faster, better business decisions.

      There is a shortage of data science skills in the market today. Many people aspire to be data science professionals, and as a result, there is a huge need for them to be able to understand the science better using practical, real-world examples. IBM’s new project gives data science professionals the ability to learn, create and collaborate.

      One of the first things the Data Science Experience provides is content to help data scientists understand what they need to know. It also provides a platform to help them create algorithms, solutions and insights that can be built on open tooling. The open nature of the project shows through here, because no matter what language the users learned on, IBM will help them get started creating insights from whatever data they have.

      “And to make it a team sport, we enable you to share what you’ve learned or what you’ve built with the rest of the data science community through what we’re calling an exchange,” Gunnar said. “The Data Science Experience enables organizations to collaborate across what they build. Sharing and collaborating is a big part of what we do.”

      The Data Science Experience brings together content, data, models, and open-source resources from IBM and others, including H2O, RStudio and Jupyter Notebooks on Apache Spark. Moreover, any model built in the Data Science Experience can be extended to real-world applications through IBM MobileFirst, the Web, IBM’s Internet of things (IoT) technology or IBM’s cognitive solutions.

      In related news, IBM announced that it joined the R Consortium to advance the R programming language for data science applications. With that move, IBM is extending the agility of Spark to more than 2 million members of the R community through new contributions to SparkR, SparkSQL and Apache SparkML.

      “With Apache Spark, we see an opportunity to drive significant innovation into the community to benefit data engineers, data scientists and application developers,” Bob Picciano, senior vice president of IBM Analytics, said in a statement. “Our IBM Analytics platform is designed for blending those new technologies and solutions into existing architectures. It’s ready-made to take advantage of whatever innovations lie ahead as more and more data scientists around the globe create solutions based on Spark.”

      Indeed, one such upcoming innovation from IBM, or another “drumbeat,” will come later this year when IBM delivers a new platform that lets enterprises consume the insights generated with the Data Science Experience. This is one of a series of announcements IBM will make as it builds out a broader vision to help clients realize more value from data, Gunnar said.

      “The Data Science Experience qualifies as the next step in IBM’s long-term investments in Apache Spark and should pay substantial dividends for critical Spark constituencies,” said Charles King, principal analyst at Pund-IT.

      In a press release, IBM said it has forged partnerships with data science organizations such as Galvanize, Lightbend and RStudio. In addition, IBM has built Spark into the core of its platforms, including Watson and IBM Commerce, as well as its analytics, systems and cloud solutions and more than 30 offerings, such as IBM BigInsights for Apache Hadoop, IBM Analytics on Apache Spark, Spark with Power Systems, Watson Analytics, SPSS Modeler and IBM Stream Computing. In 2015, IBM also open-sourced its SystemML machine learning technology to advance Spark’s machine learning capabilities.

      Users such as USA Cycling are employing IBM’s Spark technology. USA Cycling Women’s Team Pursuit is using IBM Spark, Watson IoT, as well as IBM mobile and cloud solutions to gain insights for training strategies and racing tactics. The team can now get advanced analysis of rider data, calculate dynamic race positioning and determine the grouping of riders over the race track.

      Meanwhile, IBM continues to grow its analytics ecosystem. It has contributed to related projects, including Apache Toree, EclairJS, Apache Quarks, Apache Mesos and Apache Tachyon (now called Alluxio), and has made major contributions to the Apache Spark sub-projects SparkSQL, SparkR, MLlib and PySpark, with more than 3,000 total contributions in the last year, IBM officials said.

      “Just as IBM played a critical role in the development of computer science, we can see many similarities today,” Picciano said in a statement. “Computer science went mainstream with the introduction of the PC. With data science, the major roadblock is having access to large data sets and having the ability to work with so much data. With today’s announcement, clients can have both.”

      Indeed, this IBM move is about making data science available for the masses, Gunnar said. “This is about enabling a foundation to the masses that then bridges to a cognitive system,” she noted. “We fully anticipate through this offering being able to grow the number of data science professionals that are out in the market.”

      IBM is trying to transform how companies use data across their organizations. The company wants organizations to take data from IT and put it at the fingertips of developers, data science professionals and line-of-business professionals, so they can make decisions that transform the business in ways they had not considered before.

      “That means that you have to be more agile and collaboration is key,” Gunnar said. “This notion of team sport is pivotal to that—through things like Watson Analytics and the Data Science Experience.”

      Darryl K. Taft
      Darryl K. Taft covers the development tools and developer-related issues beat from his office in Baltimore. He has more than 10 years of experience in the business and is always looking for the next scoop. Taft is a member of the Association for Computing Machinery (ACM) and was named 'one of the most active middleware reporters in the world' by The Middleware Co. He also has his own card in the 'Who's Who in Enterprise Java' deck.
