Sybase Technical Library - Product Manuals Home
[Search Forms] [Previous Section with Hits] [Next Section with Hits] [Clear Search] Expand Search

Bulk copying data into partitioned tables [Table of Contents] Using the bcp options

Utility Guide

[-] Chapter 3 Using bcp to Transfer Data to and from Adaptive Server
[-] bcp performance
[-] Using parallel bulk copy to copy data into a specific partition

Using parallel bulk copy to copy data into a specific partition

Use parallel bulk copy to copy data in parallel to a specific partition. Parallel bulk copy substantially increases performance during bcp sessions because it can split large bulk copy jobs into multiple sessions and run the sessions concurrently.

To use parallel bulk copy:

Figure 3-1 illustrates the parallel bulk copy process.

Figure 3-1: Copying data into a partitioned table using parallel bulk copyraster

See the Adaptive Server Enterprise Performance and Tuning Guide for information about partitioning a table.

When using parallel bulk copy to copy data out, you cannot specify which partitions bcp should use.

bcp in and locks

When you copy in to a table using bcp, and particularly when you copy in to a table using parallel bcp, the copy process acquires the following locks:

If you are copying in very large tables, and especially if you are using simultaneous copies into a partitioned table, this can involve a very large number of locks.

To avoid running out of locks:

Parallel bulk copy methods

Use one of the following methods to copy in data using parallel bulk copy:

Parallel bulk copy syntax

The syntax for parallel bulk copy is:

bcp table_name[:partition_number] in file_name -Pmypassword 

where:

Using parallel bulk copy on partitioned tables

To copy sorted data in parallel into a specific partition:

bcp copies each file or set of line numbers to a separate partition. For example, to use parallel bulk copy to copy in sorted data to mydb..bigtable from four files into four partitions, enter:

bcp mydb..bigtable:1 in file1 -Pmypassword -c &
bcp mydb..bigtable:2 in file2 -Pmypassword -c &
bcp mydb..bigtable:3 in file3 -Pmypassword -c &
bcp mydb..bigtable:4 in file4 -Pmypassword -c &

Parallel bulk copy and IDENTITY columns

When you are using parallel bulk copy, IDENTITY columns can cause a bottleneck. As bcp reads in the data, the utility both generates the values of the IDENTITY column and updates the IDENTITY column's maximum value for each row. This extra work may adversely affect the performance improvement that you expected to receive from using parallel bulk copy.

To avoid this bottleneck, you can explicitly specify the IDENTITY starting point for each session.

Retaining sort order

If you copy sorted data into the table without explicitly specifying the IDENTITY starting point, bcp might not generate the IDENTITY column values in sorted order. Parallel bulk copy reads the information into all the partitions simultaneously and updates the values of the IDENTITY column as it reads in the data.

A bcp statement with no explicit starting point would produce IDENTITY column numbers similar to those shown in Figure 3-2:

Figure 3-2: Producing IDENTITY columns in sorted orderraster

The table has a maximum IDENTITY column number of 119, but the order is no longer meaningful.

If you want Adaptive Server to enforce unique IDENTITY column values, you must run bcp with either the -g or -E parameter.

Specifying the starting point from the command line

Use the -g id_start_value flag to specify an IDENTITY starting point for a session in the command line.

The -g parameter instructs Adaptive Server to generate a sequence of IDENTITY column values for the bcp session without checking and updating the maximum value of the table's IDENTITY column for each row. Instead of checking, Adaptive Server updates the maximum value at the end of each batch.

Warning!

Be cautious about creating duplicate identity values inadvertently when you specify identity value ranges that overlap.

To specify a starting IDENTITY value, enter:

bcp [-gid_start_value]

For example, to copy in four files, each of which has 100 rows, enter:

bcp mydb..bigtable in file1 -g100
bcp mydb..bigtable in file2 -g200
bcp mydb..bigtable in file3 -g300
bcp mydb..bigtable in file4 -g400

Using the -g parameter does not guarantee that the IDENTITY column values are unique. To ensure uniqueness, you must:

Specifying the starting point using the data file

Use the -E parameter to set the IDENTITY starting point explicitly from the data file.

The -E parameter instructs bcp to prompt you to enter an explicit IDENTITY column value for each row. If the number of inserted rows exceeds the maximum possible IDENTITY column value, Adaptive Server returns an error.


Bulk copying data into partitioned tables [Table of Contents] Using the bcp options