Using Xcalar Design with Jupyter

Xcalar Design supports the Jupyter Notebook, enabling you to write code directly in the browser to interactively analyze data obtained from a table.

NOTE: This topic assumes that you are familiar with Jupyter. For more information about the Jupyter user interface for running code, generating documents, and so on, see https://jupyter-notebook.readthedocs.io.

The term notebook refers to the Jupyter Notebook.

Programming language supported

While Jupyter can run code written in various programming languages, the Jupyter Notebook accessed from Xcalar Design supports only Python 2.7.

Relationship between a notebook and a Xcalar Design workbook

You can create more than one notebook for each workbook. Remember that the purpose of opening a notebook is to use Jupyter to analyze data from a table. Therefore, Xcalar recommends that you do not open a notebook containing code that connects the notebook to a workbook other than the active one. If you do, running the code in the notebook might not produce the expected results.

EXAMPLE: Your currently active workbook is named Workbook1. Suppose that you earlier created a notebook named MyNotebook2 while a workbook named Workbook2 was active. Using MyNotebook2 for data from Workbook1 might cause unexpected results. To work on data from Workbook1, create a new notebook specifically for Workbook1.

If a notebook does not contain any code that connects to a workbook, you can use the notebook with any workbook.

Naming a notebook

After you create a notebook, follow these Jupyter naming conventions to name the notebook:

  • The name must contain at least one character.
  • These characters are disallowed: colon (:), forward slash (/), and backslash (\).
  • The name is case-sensitive.
RECOMMENDATION: Because it is important to open the appropriate notebook for the active workbook, always name your notebook descriptively so that you can easily relate a notebook to a workbook. For example, for a workbook named Airlines, you can create notebooks named AirlinesNB1, AirlinesNB2, and so on. If the notebooks are named untitled1 and untitled2, it is not obvious that the notebooks were created for the Airlines workbook.
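
The naming rules in this section can be checked programmatically before you rename a notebook. The following is a minimal sketch only; the helper names are hypothetical and are not part of Xcalar Design or Jupyter.

```python
# Hypothetical helpers for illustration; not provided by Xcalar Design or Jupyter.
DISALLOWED_CHARS = (":", "/", "\\")


def is_valid_notebook_name(name):
    """Return True if the name has at least one character and
    contains no colon, forward slash, or backslash."""
    if len(name) == 0:
        return False
    return not any(ch in name for ch in DISALLOWED_CHARS)


def same_notebook_name(first, second):
    """Names are case-sensitive, so compare them exactly."""
    return first == second
```

With these helpers, AirlinesNB1 is accepted, Airlines/NB1 is rejected, and AirlinesNB1 and airlinesnb1 are treated as two different names.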

Starting the Jupyter Notebook

To start a notebook, click the Jupyter Notebook icon in the Xcalar Design menu. The JUPYTER NOTEBOOK window is displayed.

Depending on whether you have an open notebook, the JUPYTER NOTEBOOK window shows one of the following:

  • The Jupyter dashboard with a file tree, which lists the notebooks created for all workbooks. You can manage your notebooks, open a notebook for editing, or upload a notebook.
  • The notebook editor with the contents of the open notebook. You can edit the contents as you normally would in Jupyter.

    IMPORTANT: Do not edit the Xcalar starter code, which is necessary for Xcalar Design to connect to the notebook. The code is in the first cell of the notebook. If you accidentally change the code, you can restore the code by following the instructions in Connecting a notebook to the active workbook.

    The following screenshot shows an example of the code:

In addition to managing or editing your notebook, you can perform the following tasks in the JUPYTER NOTEBOOK window:

  • Connect a notebook to the active workbook.
  • Generate a code snippet for a Map UDF.
  • Generate a code snippet for an import UDF.
  • Test an import UDF against a file. (This feature will be available in a future release.)

What happens if you open a notebook not connected to the active workbook

If you open a notebook created from a workbook that is not the active workbook, a warning message is displayed, which shows the name of the expected workbook. Click CLOSE to dismiss the warning message, and you can continue working with the notebook. However, to avoid unexpected results in the future, Xcalar recommends that you complete one of the following procedures, depending on your reason for opening the notebook.

If you open the notebook by mistake

If you open an unintended notebook, follow these steps by using the Jupyter user interface:

  1. Close the notebook.
  2. Open a notebook corresponding to the currently active workbook.

If you open the correct notebook

If the notebook is indeed the one you want to use, switch to the workbook that is shown as the expected workbook in the warning message. For information about switching to another workbook, see Workbook Browser. After the correct workbook becomes active, you can use the notebook with the tables in the workbook as usual.

If you want to connect the notebook to the active workbook

You might have created the notebook from a workbook other than the one that is active. If you want to use the notebook with the active workbook from now on, see Connecting a notebook to the active workbook.

Connecting a notebook to the active workbook

After you open a notebook, you can connect it to the currently active workbook, regardless of which workbook you were using when you created the notebook.

You can also use the following procedure to re-establish the connection to the active workbook. For example, if you have accidentally modified the Xcalar starter code and you want to restore it to its original form, complete the procedure so that Xcalar Design writes the starter code again in a new cell. After the new code is written, delete the cell in which you made the unintended changes.

NOTE: You can connect a notebook only to the active workbook. You cannot connect it to any other workbook.

To establish the new connection, follow these steps:

  1. At the top of the JUPYTER NOTEBOOK window, click CODE SNIPPETS to display a drop-down menu.
  2. Click Connect to Xcalar Workbook in the drop-down menu. A new cell is created in the notebook, which contains the starter code for connecting the notebook to the active workbook.

    IMPORTANT: If the notebook has a cell with starter code for connecting to a workbook other than the active workbook, you must delete that cell. A notebook with connections to multiple workbooks does not work properly.

Using a UDF template

Xcalar Design provides two templates for UDFs: one for UDFs that you run as Map functions, and one for import UDFs. After you start a UDF template, you can use the Jupyter notebook editor to modify the code and then send it to the Xcalar Design User Defined Function panel.

Follow these steps to use a UDF template:

  1. Open a notebook and place the insertion point in a notebook cell.
  2. At the top of the JUPYTER NOTEBOOK window, click CODE SNIPPETS to display a drop-down menu.
  3. Click Map UDF to display the template for a UDF that you run as a Map function. Click Import UDF to display the template for an import UDF.
  4. In the UDF Template modal window, fill out the fields, which will be used in the generated code snippet in the template. You can modify the code later.
  5. After the code is displayed in the notebook cell, use the notebook editor to make appropriate changes to the code.
  6. Click Send to UDF Editor at the bottom of the notebook cell. The code is pasted to the Xcalar Design User Defined Function panel. For more information about the actions you can take with the UDF code in the panel, see Using UDFs.
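
For orientation, the following is a minimal sketch of the kind of Python functions that the two templates are meant to hold. The function names, the parameter names, and the import UDF signature (a file path plus a readable stream, yielding one dictionary per record) are assumptions made for illustration. The code that Xcalar Design actually generates from the template fields might differ.

```python
import io


def map_upper(col_value):
    """Map-style UDF sketch: take one field value and return a string."""
    return str(col_value).upper()


def import_lines(full_path, in_stream):
    """Import-style UDF sketch: yield one record (a dict) per non-empty line.

    full_path is unused here; it is kept only to mirror the assumed signature.
    """
    for line_number, line in enumerate(in_stream, 1):
        text = line.rstrip("\r\n")
        if text:
            yield {"lineNumber": line_number, "line": text}


# Try the import sketch against an in-memory stream instead of a real file.
sample = io.StringIO(u"first\n\nthird\n")
records = list(import_lines("/hypothetical/path.txt", sample))
```

Running functions like these in a notebook cell against a small in-memory sample is a quick way to check the logic before you click Send to UDF Editor.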

Publishing data from a table to a notebook

You can publish a full or partial table to a notebook. You can then use Jupyter to analyze or visualize the published rows.

RECOMMENDATION: Publish no more than 100 table rows to a notebook at a time. Publishing more than 100 rows might take a long time.

To publish a table to a notebook, follow these steps:

  1. Open a notebook that is connected to the active workbook.

    NOTE: If a notebook is not open when you publish a table, Xcalar Design creates a notebook for you.
  2. In the workbook, display the worksheet containing the table that you want to publish to the notebook. Locate the active table for publishing.
  3. Click the table title bar or the triangle in the lower-right corner of the table title bar to display a drop-down menu.
  4. Click Publish to Jupyter in the drop-down menu.
  5. To publish the entire table, select Full Table. To publish a partial table, enter the number of rows in the Partial Table field. If you enter a number, press Enter to submit it. For example, if you enter 10, the first 10 rows are published to the notebook.
  6. If the columns have prefixed names, a modal window is displayed for you to rename your columns before publishing. You can remove the prefix for each column name. After the renaming, click PUBLISH.
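
The effect of publishing a partial table and removing column prefixes can be imitated on plain Python data. The following is a sketch under two assumptions that the text above does not confirm: the rows are available as a list of dictionaries, and a prefix is separated from the column name by a double colon (::). The helper names are hypothetical.

```python
from itertools import islice

# Assumption: a prefixed column name looks like "prefix::column".
PREFIX_SEPARATOR = "::"


def strip_prefix(column_name, separator=PREFIX_SEPARATOR):
    """Return the column name without its prefix; unprefixed names are unchanged."""
    return column_name.split(separator, 1)[-1]


def first_rows(rows, limit=100):
    """Return at most `limit` rows, with prefixes removed from the column names."""
    return [
        {strip_prefix(name): value for name, value in row.items()}
        for row in islice(rows, limit)
    ]


sample_rows = [
    {"airlines::carrier": "UA", "airlines::delay": 12},
    {"airlines::carrier": "AA", "airlines::delay": 0},
]
preview = first_rows(sample_rows, limit=1)
```

The default limit of 100 mirrors the recommendation in this section to publish no more than 100 rows at a time.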
