Optimizing memory usage

Xcalar Design displays a low-memory notification when Xcalar reaches a pre-defined level of memory consumption. It is important for you to optimize memory usage to ensure that Xcalar can run at the highest performance level possible. See Understanding memory usage for more information about the low-memory notification and what steps to take in response to the notification.

Best practices for conserving memory

Xcalar Design users can follow these best practices to conserve memory:

  • Reduce the number of tables.
  • Reduce the number of fields in the Data Browser.

    NOTE: Fields that come from your dataset cannot be removed individually. For example, if your dataset has two fields: Carrier and Description, you cannot remove only the Carrier field and keep the Description field to try to reduce the number of fields.

More detailed information about these best practices is provided in the sections that follow.

Effects of tables on memory consumption

Each table in your workbooks consumes memory regardless of the table's status, which can be active, hidden, or temporary. Keeping only the tables required for modeling your queries helps conserve memory.

Methods for dropping tables

You can drop tables with these methods:

  • Display the Tables panel from the Worksheets icon. Then select the tables to drop. See Understanding and changing table statuses for more information about the Tables panel.
  • In the dataflow graph, right-click a table icon to display a pop-up menu. Select Drop Table to remove the table from the worksheet. This method enables you to see how the table was created or modified when you performed the modeling.
  • In the active worksheet, locate the table to drop, and right-click the table title bar to display a drop-down menu. Then select Drop Table.
  • Click Monitor in the toolbar and then the System icon. In the System panel, click the Release Memory icon to display the Drop Tables modal window, which lists all tables in your workbook and the amount of memory used by each. See Understanding memory usage for more information about the Drop Tables modal window.

Effects of table columns on memory used

The number of columns in a table has no significant effect on memory consumption. For example, from the same dataset, you can create two tables: one with 2 columns and one with 15 columns. Both tables use the same amount of memory, which is 16 MB. The following screenshots show the columns in the tables and the amount of memory used by each table.

Because the number of columns is not correlated with memory consumption, you cannot free up memory by deleting table columns.

Effects of fields in the Data Browser on memory used

Each table has a Data Browser associated with it. The Data Browser lists all the fields that are already in the table or can potentially be added to the table. Initially, each table you create from a data source consumes the same amount of memory because each table's Data Browser contains only the fields from the data source. These fields are listed in the prefixed fields section of the Data Browser. This group of prefixed fields consumes a fixed amount of memory, regardless of how many fields are in the group.

After you perform operations on a table, derived columns are created in the resultant table. These columns are also added to the resultant table's Data Browser, where they are listed as derived fields. Each derived field consumes additional memory.

NOTE: A table resulting from a batch dataflow does not have prefixed fields. The Data Browser for this table consists of only derived fields.

Example: A two-column table is created from a dataset with two fields: Destination and Distance. If no operations have created columns in this table, the table uses 2.2 MB of memory. Suppose you change the Distance column's data type to integer. A new table is created with a derived column named Distance_integer. In the Data Browser for this table, the field Distance remains in the prefixed fields section, and a new field named Distance_integer is added in the derived fields section. If you use the Drop Tables modal window to check memory usage, you can see that the new table consumes 4.2 MB instead of 2.2 MB of memory because of the added derived field. The more columns your operations create, the more derived fields are added to the Data Browser, and the more memory the table consumes.

NOTE: As in the case of other columns, deleting a derived column does not free up memory. For example, if you delete the Distance_integer column in the example above, the table continues to use 4.2 MB of memory. Memory used is not reduced because the derived field, Distance_integer, remains in the Data Browser.
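The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration and not part of Xcalar: the function name and constants are hypothetical, the two figures are taken from the example (2.2 MB before and 4.2 MB after one derived field), and the assumption that every further derived field costs the same amount is a simplification. Actual sizes depend on your data.

```python
# Illustrative model only. These constants come from the example in this
# topic, not from any Xcalar API, and real tables will differ.
BASE_TABLE_MB = 2.2      # table whose Data Browser holds only prefixed fields
DERIVED_FIELD_MB = 2.0   # observed increase for one derived field (4.2 - 2.2)


def estimate_table_memory_mb(derived_field_count: int) -> float:
    """Estimate table memory, ASSUMING each derived field costs the same."""
    if derived_field_count < 0:
        raise ValueError("derived_field_count must not be negative")
    return BASE_TABLE_MB + derived_field_count * DERIVED_FIELD_MB


if __name__ == "__main__":
    for count in range(4):
        print(f"{count} derived field(s): ~{estimate_table_memory_mb(count):.1f} MB")
```

Under this assumed model, deleting a column from the table changes nothing, because the estimate depends only on the number of derived fields still listed in the Data Browser.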

Removing fields from the Data Browser

Because the number of fields in the Data Browser has an impact on memory used, the logical step to conserve memory is to reduce the number of fields in the Data Browser. However, remember that in the context of memory consumption, the collection of fields from the source dataset (that is, the prefixed fields) is considered one unit. You cannot remove some fields and keep some fields in that collection. You can decide either to keep or to eliminate the collection in its entirety.

The contents of the Data Browser are not directly editable. To change the contents, use the Projection Mode feature, which enables you to keep only columns of interest in a table and to remove unnecessary ones from the Data Browser.

NOTE: The term "projection" originates from relational algebra. Its meaning in Xcalar Design should be clearly distinguished from its meaning in daily usage.

Understanding Projection Mode

Projection Mode in the Data Browser enables you to eliminate unnecessary fields that Xcalar Design keeps in memory for a table. You can choose to keep only the fields that interest you in the table and in the Data Browser. In doing so, you can reduce the amount of memory used by the table.

Using Projection Mode

Follow these steps to use Projection Mode to keep memory consumption low for a table:

  1. Double-click anywhere in the DATA column of the table to display the Data Browser.
  2. Click the down arrow in the Data Browser to select Projection Mode. The following screenshot shows the location of the arrow.

  3. Click the check box preceding the field name if you want to keep the field in the resultant table. For example, click DayOfWeek_integer if you want to keep it in the resultant table. You might have to use the scrollbar to view all field names. The following screenshot is an example of selecting the fields to keep in the Data Browser.

  4. Click the submit projection button. A new table is created with only the columns you chose in the Data Browser. As in the case of other operations, you can use the Undo button to revert the table to the previous version before the projection.

You can perform projections on any columns to reduce the number of columns in the active table so that the table is less cluttered. However, the main purpose of projection is to optimize memory usage. Therefore, pay attention to the number of fields being eliminated from the Data Browser when you decide which column to choose for projection.

Example of projection on derived fields

The Data Browser shows 19 fields. Of these, 4 are derived fields, such as DayOfWeek_integer and DepTime_integer, which are created by operations such as Map. The remaining 15 fields, such as MonthDayYear, Distance, and DepTime, come from the collection of fields in the source dataset. The names of these fields have a prefix, such as airlines.1.

Suppose you perform a projection on DayOfWeek_integer and DepTime_integer. The resultant table contains only these columns, and all fields except DayOfWeek_integer and DepTime_integer are eliminated from the Data Browser. The resultant table now consumes much less memory than the table before the projection, because the Xcalar cluster no longer needs memory to track the other 2 derived fields and the collection of fields from the data source.

Example of projection on prefixed fields

You cannot choose individual entries from the list of prefixed fields. The set of prefixed fields is eliminated or retained as one unit through projection. The prefixed fields consume a fixed amount of memory, which is about the same as that consumed by a derived field.

Suppose you are interested only in airlines1::Distance and airlines1::MonthDayYear. You cannot project these fields and discard other prefixed fields such as airlines1::Carrier and airlines1::FlightNum; the Data Browser must keep the entire collection of prefixed fields. After the projection, the unselected derived fields no longer exist in the table or the Data Browser. This frees up memory that otherwise would be needed to track 4 derived fields.
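The two projection examples in this topic follow the same rule, which the Python sketch below tries to capture: derived fields are kept or dropped one by one, while the prefixed fields are kept or dropped as a single unit. The DataBrowser class, the project function, and the two placeholder derived field names are hypothetical and exist only for this illustration; they are not an Xcalar API. The sketch lists only the five prefixed fields named in this topic rather than all 15.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBrowser:
    """Hypothetical model of a table's Data Browser (not an Xcalar API)."""
    prefix: str
    prefixed_fields: frozenset   # tracked in memory as ONE unit
    derived_fields: frozenset    # each field is tracked separately

    def tracked_units(self) -> int:
        # One unit for the whole prefixed collection, plus one per derived field.
        return (1 if self.prefixed_fields else 0) + len(self.derived_fields)


def project(browser: DataBrowser, keep) -> DataBrowser:
    """Return the Data Browser that results from projecting on `keep`."""
    keep = set(keep)
    qualified = {f"{browser.prefix}::{name}" for name in browser.prefixed_fields}
    unknown = keep - qualified - browser.derived_fields
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Selecting ANY prefixed field retains the entire prefixed collection.
    keeps_prefixed = bool(keep & qualified)
    return DataBrowser(
        prefix=browser.prefix,
        prefixed_fields=browser.prefixed_fields if keeps_prefixed else frozenset(),
        derived_fields=browser.derived_fields & keep,
    )


before = DataBrowser(
    prefix="airlines1",
    prefixed_fields=frozenset(
        {"MonthDayYear", "Distance", "DepTime", "Carrier", "FlightNum"}
    ),
    derived_fields=frozenset(
        # The last two names are placeholders invented for this sketch.
        {"DayOfWeek_integer", "DepTime_integer", "placeholder_3", "placeholder_4"}
    ),
)

on_derived = project(before, {"DayOfWeek_integer", "DepTime_integer"})
on_prefixed = project(before, {"airlines1::Distance", "airlines1::MonthDayYear"})

if __name__ == "__main__":
    print("before:", before.tracked_units(), "tracked units")
    print("projection on derived fields:", on_derived.tracked_units())
    print("projection on prefixed fields:", on_prefixed.tracked_units())
```

In this model, the projection on derived fields drops the prefixed collection and the two unselected derived fields, whereas the projection on prefixed fields keeps the whole collection (including Carrier and FlightNum) and drops all four derived fields.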
