We are excited to announce the general availability of Hortonworks Data Analytics Studio (DAS), a new service that improves the productivity of business analysts by delivering faster insights from data at scale. DAS also enables IT teams to meet the requirements of the business by optimizing cluster and storage operations. DAS makes business analysts self-sufficient and enhances their efficiency by providing them with diagnostic information, operational tools, and intelligent recommendations. At the same time, IT teams can use DAS to proactively resolve cluster issues by getting visibility into cluster utilization through intuitive tools.
DAS is delivered as part of the Hortonworks DataPlane Service (DPS). DPS is a services platform that enables businesses to discover, manage, govern and now optimize their data spread across hybrid environments. The addition of DAS to DataPlane is an important step forward in enabling Hortonworks customers to manage their data effectively across clusters and environments whether they are on-premises or in the cloud.
The driving force behind DAS is to address the disparity that currently exists between lines of business and the IT teams that support them. It is not uncommon for an IT team of 10 to 15 people to operate and maintain a big data environment and support 100-plus business analysts. Due to this disparity, it is virtually impossible for IT to scale to the requirements of the business. The process currently implemented in many companies requires IT personnel to manually review Hive logs for each query, evaluate its syntax, determine which tables and columns the query uses, and then come up with a specific set of recommendations to improve query performance. This manual process involves multiple round trips between IT and business analysts, which leads to long delays, lower satisfaction, and missed SLAs.
DAS leverages open-source technologies such as Apache Hive to share and extend the value of a modern data architecture in heterogeneous environments. It helps infrastructure administrators manage and optimize the performance of their Hive workloads by delivering visibility into query patterns and storage hotspots. DAS improves performance by uncovering inhibitors to query speed and by providing recommendations to improve query efficiency.
In the past, the Hive view did not provide full auto-complete capability at authoring time. We’ve addressed this shortcoming in DAS. This is not a trivial task, especially on large databases, but through a number of caching optimizations we were able to make it work smoothly even with thousands of tables.
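To give a feel for why caching matters for auto-complete on large databases, here is a minimal sketch in Python. It is not how DAS implements the feature. The `TableNameCache` class, the `fetch_tables` callable, and the sample table names are all hypothetical. The idea shown is simply to load table names once, keep them sorted, and answer each prefix lookup with a binary search instead of going back to the metastore.

```python
import bisect


class TableNameCache:
    """Hypothetical in-memory cache of table names for prefix completion."""

    def __init__(self, fetch_tables):
        # fetch_tables is an assumed callable that returns table names,
        # for example by querying a metastore. It is called lazily.
        self._fetch_tables = fetch_tables
        self._names = None
        self.fetch_count = 0

    def _load(self):
        # Fetch and sort only on the first lookup (or after invalidate()).
        if self._names is None:
            self._names = sorted(name.lower() for name in self._fetch_tables())
            self.fetch_count += 1
        return self._names

    def complete(self, prefix, limit=10):
        """Return up to `limit` cached names that start with `prefix`."""
        names = self._load()
        prefix = prefix.lower()
        start = bisect.bisect_left(names, prefix)
        matches = []
        for name in names[start:start + limit]:
            if not name.startswith(prefix):
                break
            matches.append(name)
        return matches

    def invalidate(self):
        # Drop the cached list so the next lookup fetches fresh names.
        self._names = None


def fake_metastore():
    # Stand-in for a real metadata source; the names are made up.
    return ["orders", "order_items", "customers", "web_logs"]


cache = TableNameCache(fake_metastore)
suggestions = cache.complete("ord")
```

In this sketch, repeated calls to `complete` reuse the sorted list, so the cost of listing tables is paid once rather than on every keystroke.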
DAS gives business analysts specific recommendations for optimizing the performance of queries on Hive tables, based on a heuristic recommendation engine. Query optimization is difficult, but DAS provides automated recommendations in simple language so that business analysts can self-optimize their queries.
DAS enhances the productivity of the business analysts by automating routine tasks through out-of-the-box functionality. Analysts can compose queries using an intuitive query composer. The auto-complete functionality in DAS suggests SQL commands, keywords, and table columns based on the context to help edit queries faster. Analysts can also preview a few rows from the table and they have the flexibility to save queries in order to view and edit them later.
DAS provides IT teams with visibility into cluster utilization through an intuitive database heat map. IT teams can easily see which columns and tables are used for joins, and can change the data layout to proactively optimize the performance of queries with different search criteria.
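The idea behind a join heat map can be illustrated with a short Python sketch. This is an assumption-laden illustration, not DAS code. The `join_heat` function and the sample join events are hypothetical, and the sketch assumes the join pairs have already been extracted from the query history.

```python
from collections import Counter


def join_heat(join_events):
    """Count how often each (table, column) pair takes part in a join.

    join_events is an assumed, pre-extracted list of pairs, where each
    side of a pair is a (table, column) tuple.
    """
    heat = Counter()
    for left, right in join_events:
        heat[left] += 1
        heat[right] += 1
    return heat


# Made-up join events standing in for a real query history.
events = [
    (("orders", "customer_id"), ("customers", "id")),
    (("orders", "customer_id"), ("customers", "id")),
    (("orders", "id"), ("order_items", "order_id")),
]

heat = join_heat(events)
for (table, column), count in sorted(heat.items()):
    print(f"{table}.{column}: {count}")
```

Columns with the highest counts are natural candidates to look at first when deciding how to lay out the data.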
With DAS, IT teams can quickly narrow down problematic queries in a large cluster through pre-defined reports and searches. Analysts can search for queries executed on Hive tables in a database and further refine the search by parameters such as the status of the query, the queue it belongs to, the user who ran it, the tables it read and wrote, and its execution mode.
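The kind of refinement described above can be sketched as a simple filter over query records. The Python below is illustrative only. The `QueryRecord` fields, the `search` function, and the sample data are hypothetical and do not reflect the DAS data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class QueryRecord:
    # Hypothetical record shape; field names are assumptions.
    query_id: str
    status: str
    queue: str
    user: str
    tables_read: set = field(default_factory=set)


def search(records, status=None, queue=None, user=None, reads_table=None):
    """Return records that match every filter that was supplied."""
    results = []
    for record in records:
        if status is not None and record.status != status:
            continue
        if queue is not None and record.queue != queue:
            continue
        if user is not None and record.user != user:
            continue
        if reads_table is not None and reads_table not in record.tables_read:
            continue
        results.append(record)
    return results


# Made-up history used only to exercise the filter.
history = [
    QueryRecord("q1", "FAILED", "default", "alice", {"orders"}),
    QueryRecord("q2", "SUCCESS", "default", "bob", {"orders", "customers"}),
    QueryRecord("q3", "FAILED", "etl", "alice", {"web_logs"}),
]

failed_on_orders = search(history, status="FAILED", reads_table="orders")
```

Each filter is optional, so the same function covers a broad search (all failed queries) and a narrow one (failed queries by one user on one table).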
This is just the start of the journey for DAS. Stay tuned for richer functionality in the future including scheduled queries, simple visualizations and more analytical reports of warehouse usage.
DAS is generally available and is the fourth application to be available as part of the Hortonworks DataPlane Service, an enterprise-grade platform for global data management.
For more information about DAS, please visit https://hortonworks.com/products/dataplane-apps/data-analytics-studio/.
DAS documentation can be accessed from https://docs.hortonworks.com/HDPDocuments/DAS/DAS-1.0.0/index.html