Sybase Technical Library - Product Manuals Home
[Search Forms] [Previous Section with Hits] [Next Section with Hits] [Clear Search] Expand Search

bcp modes [Table of Contents] Using the bcp options

Utility Guide

[-] Chapter 3 Using bcp to Transfer Data to and from Adaptive Server
[-] bcp performance

bcp performance

Keeping indexes and triggers on a table causes the bulk copy utility to use slow bcp automatically. However, slow bcp can fill the transaction log very quickly.

To improve the performance of bcp:

Using fast or slow bcp

The existence of indexes and triggers on tables affects transfer speed. When you use bcp on such tables, bcp automatically uses its slow mode, which logs data inserts in the transaction log. These logged inserts can cause the transaction log to become very large.

To control this data excess and ensure that the database is fully recoverable in the event of a failure, you can back up the log with dump transaction.

bcp does not fire any trigger that exists on the target table.

Fast bcp logs only the page allocations. For copying data in, bcp is fastest if your database table has no indexes or triggers.

However, if you used fast bcp to make data inserts, which fast bcp does not log, you cannot back up (dump) the transaction log to a device. The changes are not in the log, and a restore cannot recover nonexistent backup data. The requested backup (dump transaction) produces an error message that instructs you to use dump database instead. This restriction remains in force until a dump database successfully completes.

For more information about dump database and dump transaction, see the Adaptive Server Enterprise System Administration Guide, and the Adaptive Server Enteprise Reference Manual.

Copying tables with indexes or triggers

The bcp program is optimized to load data into tables that do not have indexes or triggers associated with them. It loads data into tables without indexes or triggers at the fastest possible speed, with a minimum of logging. Page allocations are logged, but the insertion of rows is not.

When you copy data into a table that has one or more indexes or triggers, a slower version of bcp is automatically used, which logs row inserts. This includes indexes implicitly created using the unique integrity constraint of a create table statement. However, bcp does not enforce the other integrity constraints defined for a table.

By default, the select into/bulkcopy/pllsort option is false (off) in newly created databases. To change the default situation, turn this option on in the model database.

The log can grow very large during slow bcp because bcp logs inserts into a table that has indexes or triggers. After the bulk copy completes, back up your database with dump database, then truncate the log with dump transaction after the bulk copy completes and after you have backed up your database with dump database.

While the select into/bulkcopy/pllsort option is on, you cannot dump the transaction log. Issuing dump transaction produces an error message instructing you to use dump database instead.

Warning!

Be certain that you dump your database before you turn off the select into/bulkcopy/pllsort flag. If you have inserted unlogged data into your database, and you then perform a dump transaction before performing a dump database, you will not be able to recover your data.

Fast bcp runs more slowly while a dump database is taking place.

Table 3-1 shows which version bcp uses when copying in, the necessary settings for the select into/bulkcopy/pllsort option, and whether the transaction log is kept and can be dumped.

Comparing fast and slow bcp

select into/bulkcopy/pllsort

on

off

fast bcp (no indexes or triggers on target table)

OK

dump transaction prohibited

bcp prohibited

dump transaction

slow bcp (one or more indexes or triggers)

OK

dump transaction prohibited

OK

dump transaction OK

The performance penalty for copying data into a table that has indexes or triggers in place can be severe. If you are copying in a very large number of rows, it may be faster to drop all the indexes and triggers beforehand with drop index (or alter table, for indexes created as a unique constraint) and drop trigger; set the database option; copy the data into the table; re-create the indexes and triggers; and then dump the database. Remember to allocate disk space for the construction of indexes and triggers: about 2.2 times the amount of space needed for the data.

Configuring databases for fast bcp

To allow a user to copy in data using fast bcp, either a System Administrator or the Database Owner first must use sp_dboption to set select into/bulkcopy/pllsort to true on the database that contains the target table or tables. If the option is set to false when a user tries to use fast bcp to copy data into a table without indexes or triggers, Adaptive Server generates an error message.

You do not need to set the select into/bulkcopy/pllsort option to true to copy out data from, or to copy in data to a table that has indexes or triggers. Slow bcp always copies tables with indexes or triggers and logs all inserts.

By default, the select into/bulkcopy/pllsort option is set to false (off) in newly created databases. To change the default setting for future databases, turn this option on (set to true) in the model database.

Dropping indexes and triggers

If you are copying a very large number of rows, you must have 1.2 times the amount of space needed for the data and enough space for the server to reconstruct a clustered index.

Copying in data with fast bcp

Table 3-2 summarizes the steps for copying in data to Adaptive Server using fast bcp.

Steps for copying in data using fast bcp

Step

Who can do it

Use sp_dboption to set select into/bulkcopy/pllsort to true.

Run checkpoint in the database that was changed.

System Administrator or Database Owner

Have enough space to re-create any indexes and triggers on the table.

Drop the indexes and triggers on the table.

Table owner

Have insert permission on the table.

Granted by the table owner

Perform the copy with bcp.

Any user with insert permission

Re-create the indexes and triggers.

Table owner

Reset sp_dboption, if desired, and run checkpoint in the database that was changed.

System Administrator or Database Owner

Use dump database to back up the newly inserted data.

System Administrator, Operator, or Database Owner

Run stored procedures or queries to determine whether any of the newly loaded data violates rules.

Table owner or stored procedure owner

Bulk copying data into partitioned tables

In certain circumstances, you can improve bcp performance dramatically by executing several bcp sessions with a partitioned table.

Partitioned tables improve insert performance by reducing lock contention and by distributing I/O over multiple devices. bcp performance with partitioned tables is improved primarily because of this distributed I/O.

When you execute a bcp session on a partitioned table, consider:

To copy data into a partitioned heap table, you can either:

To copy data randomly into partitions

To copy data randomly into partitioned tables when using multiple bcp sessions, you must:

  1. Configure the table with as many partitions and physical devices as you require for your system.

    For more information, see the Performance and Tuning Guide, and "Using parallel bulk copy to copy data into a specific partition" of this manual.

  2. Make sure Adaptive Server is configured with enough locks to support multiple bcp sessions. For information on configuring locks, see the System Administration Guide.

  3. Remove the triggers and indexes on the table and enable fast bcp. See "Using fast or slow bcp" for instructions.

    If you use slow bcp, performance may not improve significantly after you remove the triggers and indexes. Also, if the table contains indexes, you may experience deadlocks on the index pages.

  4. Divide the bcp input file into as many files of equal size as the number of planned simultaneous bcp sessions.

    You also can use the -F first_row and -L last_row options to specify the start and end of each "input file."

  5. Execute the bcp sessions with separate files in parallel on the local Adaptive Server machine.

    For example, on UNIX platforms, you can execute different sessions in different shell windows or start individual bcp sessions in the background.

Read the Performance and Tuning Guide for a detailed description of copying data into partitioned tables.

Monitoring bcp sessions with dbcc checktable and sp_helpsegment

If you do not specify which partition the bcp sessions should use, Adaptive Server randomly assigns the multiple bcp sessions to the table's available partitions. If this random assignment occurs, be sure to monitor the partitions to ensure that the process has evenly distributed the inserts by using either of the following:

For more information about dbcc checktable, see the System Administration Guide. For more information about sp_helpsegment and sp_helpartition, see the Reference Manual.

For more information about table partitions, see the Performance and Tuning Guide.

Reducing logging by increasing page allocations

If you are using fast bcp, consider that each bcp in batch requires the page manager to allocate one or more extents. Each such allocation generates a single log record.

Use the number of preallocated extents configuration parameter to specify how many extents Adaptive Server is to allocate through the page manager.

Adaptive Server may allocate more pages than are actually needed, so keep the value small when space is limited. These pages are deallocated at the end of the batch.

For more information, see the System Administration Guide.

Using parallel bulk copy to copy data into a specific partition

Use parallel bulk copy to copy data in parallel to a specific partition. Parallel bulk copy substantially increases performance during bcp sessions because it can split large bulk copy jobs into multiple sessions and run the sessions concurrently.

To use parallel bulk copy:

Figure 3-1 illustrates the parallel bulk copy process.

Figure 3-1: Copying data into a partitioned table using parallel bulk copyraster

See the Adaptive Server Enterprise Performance and Tuning Guide for information about partitioning a table.

When using parallel bulk copy to copy data out, you cannot specify which partitions bcp should use.

bcp in and locks

When you copy in to a table using bcp, and particularly when you copy in to a table using parallel bcp, the copy process acquires the following locks:

If you are copying in very large tables, and especially if you are using simultaneous copies into a partitioned table, this can involve a very large number of locks.

To avoid running out of locks:

Parallel bulk copy methods

Use one of the following methods to copy in data using parallel bulk copy:

Parallel bulk copy syntax

The syntax for parallel bulk copy is:

bcp table_name[:partition_number] in file_name -Pmypassword 

where:

Using parallel bulk copy on partitioned tables

To copy sorted data in parallel into a specific partition:

bcp copies each file or set of line numbers to a separate partition. For example, to use parallel bulk copy to copy in sorted data to mydb..bigtable from four files into four partitions, enter:

bcp mydb..bigtable:1 in file1 -Pmypassword -c &
bcp mydb..bigtable:2 in file2 -Pmypassword -c &
bcp mydb..bigtable:3 in file3 -Pmypassword -c &
bcp mydb..bigtable:4 in file4 -Pmypassword -c &

Parallel bulk copy and IDENTITY columns

When you are using parallel bulk copy, IDENTITY columns can cause a bottleneck. As bcp reads in the data, the utility both generates the values of the IDENTITY column and updates the IDENTITY column's maximum value for each row. This extra work may adversely affect the performance improvement that you expected to receive from using parallel bulk copy.

To avoid this bottleneck, you can explicitly specify the IDENTITY starting point for each session.

Retaining sort order

If you copy sorted data into the table without explicitly specifying the IDENTITY starting point, bcp might not generate the IDENTITY column values in sorted order. Parallel bulk copy reads the information into all the partitions simultaneously and updates the values of the IDENTITY column as it reads in the data.

A bcp statement with no explicit starting point would produce IDENTITY column numbers similar to those shown in Figure 3-2:

Figure 3-2: Producing IDENTITY columns in sorted orderraster

The table has a maximum IDENTITY column number of 119, but the order is no longer meaningful.

If you want Adaptive Server to enforce unique IDENTITY column values, you must run bcp with either the -g or -E parameter.

Specifying the starting point from the command line

Use the -g id_start_value flag to specify an IDENTITY starting point for a session in the command line.

The -g parameter instructs Adaptive Server to generate a sequence of IDENTITY column values for the bcp session without checking and updating the maximum value of the table's IDENTITY column for each row. Instead of checking, Adaptive Server updates the maximum value at the end of each batch.

Warning!

Be cautious about creating duplicate identity values inadvertently when you specify identity value ranges that overlap.

To specify a starting IDENTITY value, enter:

bcp [-gid_start_value]

For example, to copy in four files, each of which has 100 rows, enter:

bcp mydb..bigtable in file1 -g100
bcp mydb..bigtable in file2 -g200
bcp mydb..bigtable in file3 -g300
bcp mydb..bigtable in file4 -g400

Using the -g parameter does not guarantee that the IDENTITY column values are unique. To ensure uniqueness, you must:

Specifying the starting point using the data file

Use the -E parameter to set the IDENTITY starting point explicitly from the data file.

The -E parameter instructs bcp to prompt you to enter an explicit IDENTITY column value for each row. If the number of inserted rows exceeds the maximum possible IDENTITY column value, Adaptive Server returns an error.


bcp modes [Table of Contents] Using the bcp options