Enterprise Spark Big Data Solutions At Scale

HORTONWORKS DELIVERS SPARK FOR ENTERPRISE DEPLOYMENTS



Apache™ Spark Overview

Hortonworks is unleashing the power of the Apache Spark big data processing framework for enterprise scale, unifying the capabilities of open enterprise Apache Hadoop® and the in-memory analytic capabilities of Apache Spark to maximize organizational value.

Spark is Better as Part of the Platform
Spark is certified as YARN-ready and is part of Hortonworks Data Platform (HDP). Memory- and CPU-intensive enterprise Spark-based applications can coexist with other workloads deployed in a YARN-enabled cluster. Spark has first-class support for external data sources and can run directly on the cluster under YARN, which is where enterprises want to perform their data analysis. This approach avoids the need to create and manage dedicated enterprise Spark clusters and allows for more efficient resource use within a single cluster.

Spark Requires Enterprise-Grade Security and Governance
As part of HDP, Spark has access to the same governance, security and management policies as other components of the HDP stack. The Spark big data processing framework is one of the fastest-moving projects in the Big Data ecosystem, and its libraries remain at different levels of maturity. Hortonworks investigates, validates, certifies and then supports each of the components in the Spark project. This approach is key to the way we add value for our customers.

Notebooks Make Spark and Data Science Easier to Consume & Share
Web-based notebooks bring data ingestion, exploration, visualization, sharing and collaboration capabilities to Hadoop and Spark. Hortonworks is making a substantial investment in Apache Zeppelin; we plan to make Zeppelin ready for production use by making it easier to use, while adding security, stability and R support.

By unifying Apache Spark and Hadoop, we combine Spark-driven agile analytic workflows with the vast data sets and economics of Hadoop. With Hortonworks, enterprises can deploy the Apache Spark big data processing framework with the industry’s best security, governance, and operations capabilities.

WHAT IS HORTONWORKS' FOCUS ON SPARK?

With the release of Spark 1.6, Hortonworks commits to helping customers accelerate data science, maintain seamless data access, and drive innovation at the core.

Spark, as part of open enterprise Hadoop, empowers organizations to scale Spark for enterprise value.


Data Science Acceleration

Improving data science productivity by enhancing Apache Zeppelin and by contributing additional Spark algorithms and packages to ease the development of key solutions.

For example, Project Magellan is an open source library for geospatial analytics in Apache Spark. It facilitates geospatial queries and builds upon Spark to solve hard problems dealing with geospatial data at scale.


Seamless Data Access

Spark SQL provides SQL and DataFrame APIs to access structured data, while Spark Streaming enables developers to easily build scalable, high-throughput, fault-tolerant stream processing of live data streams.

Hortonworks has been improving Spark’s integration with YARN, HDFS, Hive, HBase and ORC. Specifically, we believe that we can further optimize data access via the new Data Source API.


Innovate at the Core

Enable RDD sharing with the HDFS Memory Tier

Contribute additional machine learning algorithms

Enhance enterprise Spark’s security, governance, operations, and readiness


To learn more about all the exciting Spark innovation, check out our Apache Spark page.

HOW TO GET STARTED WITH APACHE SPARK AT SCALE?

Listen to our recent webinar - Spark at Scale with Hadoop