Organizations want to empower their business analysts, database administrators and data scientists with a single, easy-to-use tool to directly develop, test and operationalize their business logic on raw data stored in multiple disparate data sources and in various structured, semi-structured and unstructured formats.
Xcalar Design expands the pool of users that work with data processing applications — those who prepare and integrate data for analytics, those who structure data for data warehousing, and those who train and apply machine learning models at cloud-scale — by providing them with a sophisticated visual studio and IDE with direct access to raw data hosted on any public or private cloud. With Xcalar Design, users have the ability to combine three paradigms — SQL, structured programming, and visual programming — within this single elegant studio that is powered by Xcalar Data Platform.
Xcalar Design - 10x increase in developer productivity
Xcalar Design Features
A sophisticated visual studio and IDE for enhanced develop > test > operationalize productivity
Interactively design your business logic
Xcalar Design offers two modes for designing business logic, so users can model in the mode they prefer. In SQL Mode, users can write SQL in the SQL panel or copy and paste SQL from a legacy system. When the SQL is executed, result sets and schema changes are visible immediately. In Advanced Mode, in addition to debugging their SQL, users can design their dataflows visually with point-and-click actions. They can scroll interactively through hundreds of billions of rows in a spreadsheet-like interface and invoke relational operations. In either mode, users can write user-defined functions in Python, which can be called from within SQL code.
Automatic schema detection
Users have the ability to point to any data in their data lake and Xcalar Design automatically detects the schema for their data. Users have the flexibility to make changes to the detected schema or even to copy-and-paste a new schema. If the schema changes at any point during operationalization, for example, when new fields get added to a JSON source file previously ingested, Xcalar Design gives users the ability to view these schema changes as violations that do not conform to the previous rules set by the user, and allows them to easily accommodate these changes in their algorithms.
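The idea of detecting a schema and then flagging later changes as violations can be illustrated with a minimal sketch in plain Python. This is not Xcalar's implementation or API; the function names `infer_schema` and `schema_violations` and the sample records are hypothetical, chosen only to show the concept on JSON data.

```python
import json


def infer_schema(records):
    """Map each field name to the set of Python type names seen for it."""
    schema = {}
    for record in records:
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


def schema_violations(expected, record):
    """List the ways one record departs from a previously accepted schema."""
    violations = []
    for field, value in record.items():
        type_name = type(value).__name__
        if field not in expected:
            violations.append(f"new field: {field}")
        elif type_name not in expected[field]:
            violations.append(f"type change: {field} is {type_name}")
    for field in expected:
        if field not in record:
            violations.append(f"missing field: {field}")
    return violations


# Hypothetical sample data standing in for a JSON source file.
first_batch = [json.loads(line) for line in (
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b"}',
)]
expected_schema = infer_schema(first_batch)

# A later record gains a field that the accepted schema does not know about.
later_record = json.loads('{"id": 3, "name": "c", "region": "west"}')
drift = schema_violations(expected_schema, later_record)
print(drift)
```

In this sketch the added `region` field is reported as a violation rather than silently accepted, which mirrors the behavior described above: the user sees the change and decides how to accommodate it.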
Use SQL to query your source data
Xcalar Design enables users to easily create tables directly from the data lake with no prior ETL. Users can view all the tables they have created, along with their schemas, while designing their SQL code. Users can keep track of every query run, along with its status and the dataflow graph that is created for it. SQL queries can be broken up into modules and saved, so that they can be shared or reused in other dataflows. When a SQL statement fails, users can analyze its dataflow graph in Xcalar's Advanced Mode to debug the problem. Here, users can test their algorithms and combine point-and-click operations with their SQL code, if needed.
Use relational operations through a point-and-click interface
Users can apply relational operations to their data through SQL code, or they can switch to Advanced Mode and apply relational operations with point-and-click actions. Join, union, group by, pivot, filter, aggregate, sort and merge operations on the entire source data can all be done through this easy-to-use interface. Users can visually explore and interactively scroll through their data to apply relational operations or user-defined functions written in Python.
Use the tightly integrated Jupyter Notebook interface for your machine learning and other Data Science work
Xcalar Design natively integrates with Jupyter Notebook, allowing users to easily integrate advanced analytics processes at every stage of the data pipeline. Whether modeling pipelines, classifying data using machine learning, or applying a predictive analytics algorithm, Xcalar gives data scientists the ability to carry out these operations quickly and easily while supporting familiar tools.
Write user-defined functions in Python
Dataflows can be extended via user-defined functions (UDFs) coded in Python. Alongside Xcalar Design's visual programming paradigm and SQL, Python can be used to directly invoke map functions, as well as ML algorithms imported from open source libraries such as TensorFlow, Spark ML, H2O and others. Xcalar's integration with Jupyter Notebook makes it easier for hands-on programmers to write UDFs. Data-savvy business analysts, Excel and Matlab power users, and data engineers can work at maximum efficiency with this flexible interface.
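As a rough illustration of what a map-style UDF might look like, the sketch below defines an ordinary Python function and applies it to one column of every row. The names `normalize_phone` and `apply_map`, and the sample rows, are hypothetical; `apply_map` is only a stand-in for a map operator, and how a function is registered with or invoked by Xcalar is not shown here.

```python
def normalize_phone(raw):
    """Keep only the digits of a phone number; return "" for missing input."""
    if raw is None:
        return ""
    return "".join(ch for ch in str(raw) if ch.isdigit())


def apply_map(rows, column, func, new_column):
    """Stand-in for a map operator: add new_column = func(row[column])."""
    return [{**row, new_column: func(row.get(column))} for row in rows]


# Hypothetical input rows, including one with a missing value.
rows = [{"phone": "(555) 010-1234"}, {"phone": None}]
result = apply_map(rows, "phone", normalize_phone, "phone_digits")
print(result)
```

Keeping the UDF a pure function of its input value, with explicit handling of missing data, makes it easy to test in a notebook before it is used in a dataflow.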
Work visually with dataflow graphs as modular building blocks
Dataflow graphs are created and amended automatically when users perform operations in Xcalar. Dataflow graphs are auditable. They show the path to the data source, the intermediate tables created, and the operators applied to these tables. Xcalar Design users can design their algorithms through building and reusing dataflow modules. They can examine the results at every step of the flow and optimize their algorithms. Users can drill down to schemas, distribution of data across nodes, and data skew. Dataflow graphs can be archived as graphics and can be exported as JSON files.
Operationalize your algorithms
Dataflows can be operationalized and scheduled to run as batch executions. This enables users to deploy their analytics pipeline, including algorithms, transformations, and models, at cloud-scale in a secure production environment. Data scientists, business analysts and data engineers can use Xcalar Design to create dataflows, and then operationalize them with a few mouse clicks across petabytes of raw data. Complex relational operations run with linear scale-out performance. Batch dataflows can be saved, shared across clusters, loaded, parameterized, and scheduled to run.
Detect data anomalies and integrity constraint violations
Xcalar can read and analyze any arbitrary data, including data that does not conform to the expected format, encoding, type, and attributes. Users who want their data to conform to specific business rules can integrate such rules into Xcalar. All constraint violations, whether within the data or in the operations performed in the data pipeline (or algorithm), are captured in Integrity Constraint Violation (ICV) tables. Users can generate the ICV table through a point-and-click action in the UI. This enables users to find data quality issues dynamically while building a query or data pipeline.
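The concept of an ICV table can be sketched in plain Python as a split of the input into conforming rows and violating rows, with each violation recording which rules failed. This is an illustration of the idea only, not Xcalar's mechanism; the rule names, `check_row`, `split_by_constraints`, and the sample rows are all hypothetical.

```python
def check_row(row, rules):
    """Return the names of all rules that the row violates."""
    failed = []
    for name, predicate in rules.items():
        try:
            ok = predicate(row)
        except (KeyError, TypeError, ValueError):
            ok = False  # malformed rows count as violations instead of crashing
        if not ok:
            failed.append(name)
    return failed


def split_by_constraints(rows, rules):
    """Separate conforming rows from violations, keeping the failed rule names."""
    clean, violations = [], []
    for row in rows:
        failed = check_row(row, rules)
        if failed:
            violations.append({"row": row, "failed_rules": failed})
        else:
            clean.append(row)
    return clean, violations


# Hypothetical business rules expressed as predicates over a row.
rules = {
    "amount_is_non_negative": lambda r: float(r["amount"]) >= 0,
    "currency_is_three_letters": lambda r: len(r["currency"]) == 3,
}
rows = [
    {"amount": "12.50", "currency": "USD"},
    {"amount": "-3", "currency": "USD"},
    {"amount": "n/a", "currency": "US"},
    {"currency": "EUR"},
]
clean, icv = split_by_constraints(rows, rules)
print(len(clean), len(icv))
```

Rows that cannot even be evaluated, such as one with a missing field or an unparseable number, land in the violation list alongside rows that break a rule, so nothing is dropped silently.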
Point-in-time data views
The Xcalar Design IMD (insert, modify and delete) feature allows users to analyze data that is continuously changing. Xcalar Design makes it easier for users to keep track of the transformations and manipulations applied to their tables, through a visual timeline of inserts, modifications and deletes. Users can step back along this timeline to view the data as it was at any point in time, or they can view live, up-to-date data.
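One way to picture a point-in-time view is as a replay of an ordered log of inserts, modifications and deletes up to a chosen moment. The sketch below assumes a simple log format with integer timestamps and a row key; the function name `table_as_of` and the log layout are hypothetical and do not describe how Xcalar stores or replays IMD data.

```python
def table_as_of(change_log, as_of):
    """Replay inserts, modifies and deletes up to and including `as_of`."""
    table = {}
    for change in sorted(change_log, key=lambda c: c["ts"]):
        if change["ts"] > as_of:
            break  # later changes are not part of this point-in-time view
        key = change["key"]
        if change["op"] == "insert":
            table[key] = dict(change["values"])
        elif change["op"] == "modify":
            table[key] = {**table.get(key, {}), **change["values"]}
        elif change["op"] == "delete":
            table.pop(key, None)
    return table


# Hypothetical change log: ts is a logical timestamp, key identifies a row.
change_log = [
    {"ts": 1, "op": "insert", "key": "a", "values": {"qty": 5}},
    {"ts": 2, "op": "insert", "key": "b", "values": {"qty": 7}},
    {"ts": 3, "op": "modify", "key": "a", "values": {"qty": 9}},
    {"ts": 4, "op": "delete", "key": "b", "values": {}},
]

snapshot = table_as_of(change_log, as_of=2)  # view in the middle of the timeline
latest = table_as_of(change_log, as_of=4)    # view after all recorded changes
print(snapshot, latest)
```

Choosing an earlier `as_of` value corresponds to moving back along the timeline, while the largest timestamp gives the live view.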
Extend Xcalar Design functionality with plug-n-play ease
Use the Admin panel for security and control
Xcalar Design provides a very easy-to-use interface for administrators to monitor the status of the Xcalar Compute Engine and to specify the environment in which users in the cluster will use Xcalar Design. Administrators can view and control the memory usage in the cluster and can access all logs with a single click.
Monitor your Xcalar cluster at any time
Xcalar Design includes monitoring tools that display the progress of jobs and the status of the Xcalar cluster. For each operation, the progress, start and elapsed times, and JSON describing the operation performed are shown. Also, users can view cluster statistics, including memory usage, swap file usage, CPU utilization, and network traffic for all nodes in the Xcalar cluster.