Member "git-fast-import.txt" (15 Dec 2018, 55910 Bytes) of package /linux/misc/git-htmldocs-2.20.1.tar.xz:


    1 git-fast-import(1)
    2 ==================
    3 
    4 NAME
    5 ----
    6 git-fast-import - Backend for fast Git data importers
    7 
    8 
    9 SYNOPSIS
   10 --------
   11 [verse]
   12 frontend | 'git fast-import' [<options>]
   13 
   14 DESCRIPTION
   15 -----------
   16 This program is usually not what the end user wants to run directly.
   17 Most end users want to use one of the existing frontend programs,
   18 which parses a specific type of foreign source and feeds the contents
   19 stored there to 'git fast-import'.
   20 
   21 fast-import reads a mixed command/data stream from standard input and
   22 writes one or more packfiles directly into the current repository.
   23 When EOF is received on standard input, fast-import writes out
   24 updated branch and tag refs, fully updating the current repository
   25 with the newly imported data.
   26 
   27 The fast-import backend itself can import into an empty repository (one that
   28 has already been initialized by 'git init') or incrementally
   29 update an existing populated repository.  Whether or not incremental
   30 imports are supported from a particular foreign source depends on
   31 the frontend program in use.
   32 
   33 
   34 OPTIONS
   35 -------
   36 
   37 --force::
   38 	Force updating modified existing branches, even if doing
   39 	so would cause commits to be lost (as the new commit does
   40 	not contain the old commit).
   41 
   42 --quiet::
   43 	Disable all non-fatal output, making fast-import silent when it
   44 	is successful.  This option disables the output shown by
   45 	--stats.
   46 
   47 --stats::
   48 	Display some basic statistics about the objects fast-import has
   49 	created, the packfiles they were stored into, and the
   50 	memory used by fast-import during this run.  Showing this output
   51 	is currently the default, but can be disabled with --quiet.
   52 
   53 Options for Frontends
   54 ~~~~~~~~~~~~~~~~~~~~~
   55 
   56 --cat-blob-fd=<fd>::
   57 	Write responses to `get-mark`, `cat-blob`, and `ls` queries to the
   58 	file descriptor <fd> instead of `stdout`.  Allows `progress`
   59 	output intended for the end-user to be separated from other
   60 	output.
   61 
   62 --date-format=<fmt>::
   63 	Specify the type of dates the frontend will supply to
   64 	fast-import within `author`, `committer` and `tagger` commands.
   65 	See ``Date Formats'' below for details about which formats
   66 	are supported, and their syntax.
   67 
   68 --done::
   69 	Terminate with error if there is no `done` command at the end of
   70 	the stream.  This option might be useful for detecting errors
   71 	that cause the frontend to terminate before it has started to
   72 	write a stream.
   73 
   74 Locations of Marks Files
   75 ~~~~~~~~~~~~~~~~~~~~~~~~
   76 
   77 --export-marks=<file>::
   78 	Dumps the internal marks table to <file> when complete.
   79 	Marks are written one per line as `:markid SHA-1`.
   80 	Frontends can use this file to validate imports after they
   81 	have been completed, or to save the marks table across
   82 	incremental runs.  As <file> is only opened and truncated
   83 	at checkpoint (or completion) the same path can also be
   84 	safely given to --import-marks.
   85 
   86 --import-marks=<file>::
   87 	Before processing any input, load the marks specified in
   88 	<file>.  The input file must exist, must be readable, and
   89 	must use the same format as produced by --export-marks.
   90 	Multiple options may be supplied to import more than one
   91 	set of marks.  If a mark is defined to different values,
   92 	the last file wins.
   93 
   94 --import-marks-if-exists=<file>::
   95 	Like --import-marks but instead of erroring out, silently
   96 	skips the file if it does not exist.
   97 
   98 --[no-]relative-marks::
   99 	After specifying --relative-marks the paths specified
  100 	with --import-marks= and --export-marks= are relative
  101 	to an internal directory in the current repository.
  102 	In git-fast-import this means that the paths are relative
  103 	to the .git/info/fast-import directory. However, other
  104 	importers may use a different location.
  105 +
  106 Relative and non-relative marks may be combined by interweaving
  107 --(no-)relative-marks with the --(import|export)-marks= options.
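
As an illustration of the marks file format described above (one
`:markid SHA-1` entry per line), a frontend could read such a file back
with a few lines of Python.  This is only a sketch: the helper names
`parse_marks` and `merge_marks` are hypothetical and not part of Git, and
`merge_marks` merely mimics the "last file wins" rule stated for multiple
--import-marks options.

```python
def parse_marks(text):
    """Parse marks-file text into a {mark_id: object_name} dict."""
    marks = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line:
            continue  # tolerate blank lines in this sketch
        mark, sep, object_name = line.partition(" ")
        if not (sep and mark.startswith(":") and mark[1:].isdecimal()):
            raise ValueError(f"line {lineno}: malformed mark entry {line!r}")
        marks[int(mark[1:])] = object_name
    return marks


def merge_marks(*tables):
    """Merge mark tables in order; a later table overrides an earlier one."""
    merged = {}
    for table in tables:
        merged.update(table)
    return merged
```

A frontend doing incremental imports might call `parse_marks` on the file
written by a previous run to learn which source revisions were already
imported.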
  108 
  109 Performance and Compression Tuning
  110 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  111 
  112 --active-branches=<n>::
  113 	Maximum number of branches to maintain active at once.
  114 	See ``Memory Utilization'' below for details.  Default is 5.
  115 
  116 --big-file-threshold=<n>::
  117 	Maximum size of a blob that fast-import will attempt to
  118 	create a delta for, expressed in bytes.  The default is 512m
  119 	(512 MiB).  Some importers may wish to lower this on systems
  120 	with constrained memory.
  121 
  122 --depth=<n>::
  123 	Maximum delta depth, for blob and tree deltification.
  124 	Default is 50.
  125 
  126 --export-pack-edges=<file>::
  127 	After creating a packfile, print a line of data to
  128 	<file> listing the filename of the packfile and the last
  129 	commit on each branch that was written to that packfile.
  130 	This information may be useful after importing projects
  131 	whose total object set exceeds the 4 GiB packfile limit,
  132 	as these commits can be used as edge points during calls
  133 	to 'git pack-objects'.
  134 
  135 --max-pack-size=<n>::
  136 	Maximum size of each output packfile.
  137 	The default is unlimited.
  138 
  139 fastimport.unpackLimit::
  140 	See linkgit:git-config[1]
  141 
  142 PERFORMANCE
  143 -----------
  144 The design of fast-import allows it to import large projects with minimal
  145 memory usage and processing time.  Assuming the frontend
  146 is able to keep up with fast-import and feed it a constant stream of data,
  147 imports of projects holding 10+ years of history and containing
  148 100,000+ individual commits generally complete in just 1-2
  149 hours on quite modest (~$2,000 USD) hardware.
  150 
  151 Most bottlenecks appear to be in foreign source data access (the
  152 source just cannot extract revisions fast enough) or disk IO (fast-import
  153 writes as fast as the disk will take the data).  Imports will run
  154 faster if the source data is stored on a different drive than the
  155 destination Git repository (due to less IO contention).
  156 
  157 
  158 DEVELOPMENT COST
  159 ----------------
  160 A typical frontend for fast-import tends to weigh in at approximately 200
  161 lines of Perl/Python/Ruby code.  Most developers have been able to
  162 create working importers in just a couple of hours, even though it
  163 is their first exposure to fast-import, and sometimes even to Git.  This is
  164 an ideal situation, given that most conversion tools are throw-away
  165 (use once, and never look back).
  166 
  167 
  168 PARALLEL OPERATION
  169 ------------------
  170 Like 'git push' or 'git fetch', imports handled by fast-import are safe to
  171 run alongside parallel `git repack -a -d` or `git gc` invocations,
  172 or any other Git operation (including 'git prune', as loose objects
  173 are never used by fast-import).
  174 
  175 fast-import does not lock the branch or tag refs it is actively importing.
  176 After the import, during its ref update phase, fast-import tests each
  177 existing branch ref to verify the update will be a fast-forward
  178 update (the commit stored in the ref is contained in the new
  179 history of the commit to be written).  If the update is not a
  180 fast-forward update, fast-import will skip updating that ref and instead
  181 print a warning message.  fast-import will always attempt to update all
  182 branch refs, and does not stop on the first failure.
  183 
  184 Branch updates can be forced with --force, but it's recommended that
  185 this only be used on an otherwise quiet repository.  Using --force
  186 is not necessary for an initial import into an empty repository.
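
The ref update rule above can be modelled on a toy commit graph.  The
following is a sketch only, not fast-import's implementation: commits are
plain strings, `parents` is a hypothetical dict mapping each commit to its
parent list, and `plan_ref_updates` is an invented name.  It shows that
every ref is examined, that non-fast-forward updates are skipped rather
than aborting, and that `force=True` bypasses the check.

```python
def is_ancestor(parents, old, new):
    """Return True if commit `old` is reachable from commit `new`."""
    seen = set()
    stack = [new]
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return False


def plan_ref_updates(parents, current_refs, new_refs, force=False):
    """Split the requested updates into (updated, skipped).

    A ref is updated when it does not exist yet, when the update is a
    fast-forward, or when `force` is set.  All refs are always examined.
    """
    updated, skipped = {}, []
    for ref, new in new_refs.items():
        old = current_refs.get(ref)
        if old is None or force or is_ancestor(parents, old, new):
            updated[ref] = new
        else:
            skipped.append(ref)  # the real tool prints a warning here
    return updated, skipped
```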
  187 
  188 
  189 TECHNICAL DISCUSSION
  190 --------------------
  191 fast-import tracks a set of branches in memory.  Any branch can be created
  192 or modified at any point during the import process by sending a
  193 `commit` command on the input stream.  This design allows a frontend
  194 program to process an unlimited number of branches simultaneously,
  195 generating commits in the order they are available from the source
  196 data.  It also simplifies the frontend programs considerably.
  197 
  198 fast-import does not use or alter the current working directory, or any
  199 file within it.  (It does however update the current Git repository,
  200 as referenced by `GIT_DIR`.)  Therefore an import frontend may use
  201 the working directory for its own purposes, such as extracting file
  202 revisions from the foreign source.  This ignorance of the working
  203 directory also allows fast-import to run very quickly, as it does not
  204 need to perform any costly file update operations when switching
  205 between branches.
  206 
  207 INPUT FORMAT
  208 ------------
  209 With the exception of raw file data (which Git does not interpret)
  210 the fast-import input format is text (ASCII) based.  This text based
  211 format simplifies development and debugging of frontend programs,
  212 especially when a higher level language such as Perl, Python or
  213 Ruby is being used.
  214 
  215 fast-import is very strict about its input.  Where we say SP below we mean
  216 *exactly* one space.  Likewise LF means one (and only one) linefeed
  217 and HT one (and only one) horizontal tab.
  218 Supplying additional whitespace characters will cause unexpected
  219 results, such as branch names or file names with leading or trailing
  220 spaces in their name, or early termination of fast-import when it encounters
  221 unexpected input.
  222 
  223 Stream Comments
  224 ~~~~~~~~~~~~~~~
  225 To aid in debugging frontends fast-import ignores any line that
  226 begins with `#` (ASCII pound/hash) up to and including the line
  227 ending `LF`.  A comment line may contain any sequence of bytes
  228 that does not contain an LF and therefore may be used to include
  229 any detailed debugging information that might be specific to the
  230 frontend and useful when inspecting a fast-import data stream.
  231 
  232 Date Formats
  233 ~~~~~~~~~~~~
  234 The following date formats are supported.  A frontend should select
  235 the format it will use for this import by passing the format name
  236 in the --date-format=<fmt> command-line option.
  237 
  238 `raw`::
  239 	This is the Git native format and is `<time> SP <offutc>`.
  240 	It is also fast-import's default format, if --date-format was
  241 	not specified.
  242 +
  243 The time of the event is specified by `<time>` as the number of
  244 seconds since the UNIX epoch (midnight, Jan 1, 1970, UTC) and is
  245 written as an ASCII decimal integer.
  246 +
  247 The local offset is specified by `<offutc>` as a positive or negative
  248 offset from UTC.  For example EST (which is 5 hours behind UTC)
  249 would be expressed in `<offutc>` by ``-0500'' while UTC is ``+0000''.
  250 The local offset does not affect `<time>`; it is used only as an
  251 advisement to help formatting routines display the timestamp.
  252 +
  253 If the local offset is not available in the source material, use
  254 ``+0000'', or the most common local offset.  For example many
  255 organizations have a CVS repository which has only ever been accessed
  256 by users who are located in the same location and time zone.  In this
  257 case a reasonable offset from UTC could be assumed.
  258 +
  259 Unlike the `rfc2822` format, this format is very strict.  Any
  260 variation in formatting will cause fast-import to reject the value.
  261 
  262 `rfc2822`::
  263 	This is the standard email format as described by RFC 2822.
  264 +
  265 An example value is ``Tue Feb 6 11:22:18 2007 -0500''.  The Git
  266 parser is accurate, but a little on the lenient side.  It is the
  267 same parser used by 'git am' when applying patches
  268 received from email.
  269 +
  270 Some malformed strings may be accepted as valid dates.  In some of
  271 these cases Git will still be able to obtain the correct date from
  272 the malformed string.  There are also some types of malformed
  273 strings which Git will parse wrong, and yet consider valid.
  274 Seriously malformed strings will be rejected.
  275 +
  276 Unlike the `raw` format above, the time zone/UTC offset information
  277 contained in an RFC 2822 date string is used to adjust the date
  278 value to UTC prior to storage.  Therefore it is important that
  279 this information be as accurate as possible.
  280 +
  281 If the source material uses RFC 2822 style dates,
  282 the frontend should let fast-import handle the parsing and conversion
  283 (rather than attempting to do it itself) as the Git parser has
  284 been well tested in the wild.
  285 +
  286 Frontends should prefer the `raw` format if the source material
  287 already uses UNIX-epoch format, can be coaxed to give dates in that
  288 format, or its format is easily convertible to it, as there is no
  289 ambiguity in parsing.
  290 
  291 `now`::
  292 	Always use the current time and time zone.  The literal
  293 	`now` must always be supplied for `<when>`.
  294 +
  295 This is a toy format.  The current time and time zone of this system
  296 are always copied into the identity string at the time it is being
  297 created by fast-import.  There is no way to specify a different time or
  298 time zone.
  299 +
  300 This particular format is supplied as it's short to implement and
  301 may be useful to a process that wants to create a new commit
  302 right now, without needing to use a working directory or
  303 'git update-index'.
  304 +
  305 If separate `author` and `committer` commands are used in a `commit`
  306 the timestamps may not match, as the system clock will be polled
  307 twice (once for each command).  The only way to ensure that both
  308 author and committer identity information has the same timestamp
  309 is to omit `author` (thus copying from `committer`) or to use a
  310 date format other than `now`.
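
For frontends written in Python, producing the `raw` format from a
timezone-aware `datetime` is straightforward.  The sketch below assumes
the source material can supply such an object; `format_raw_date` is a
hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def format_raw_date(dt):
    """Format an aware datetime as `<time> SP <offutc>` (the raw format)."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("a timezone-aware datetime is required")
    seconds = int(dt.timestamp())  # seconds since the UNIX epoch, in UTC
    offset_minutes = int(offset.total_seconds()) // 60
    sign = "+" if offset_minutes >= 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return f"{seconds} {sign}{hours:02d}{minutes:02d}"


# The RFC 2822 example "Tue Feb 6 11:22:18 2007 -0500" in raw form:
est = timezone(timedelta(hours=-5))
print(format_raw_date(datetime(2007, 2, 6, 11, 22, 18, tzinfo=est)))
# prints "1170778938 -0500"
```

Note that the offset is only carried along for display; the first field
is already expressed in UTC.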
  311 
  312 Commands
  313 ~~~~~~~~
  314 fast-import accepts several commands to update the current repository
  315 and control the current import process.  More detailed discussion
  316 (with examples) of each command follows later.
  317 
  318 `commit`::
  319 	Creates a new branch or updates an existing branch by
  320 	creating a new commit and updating the branch to point at
  321 	the newly created commit.
  322 
  323 `tag`::
  324 	Creates an annotated tag object from an existing commit or
  325 	branch.  Lightweight tags are not supported by this command,
  326 	as they are not recommended for recording meaningful points
  327 	in time.
  328 
  329 `reset`::
  330 	Reset an existing branch (or a new branch) to a specific
  331 	revision.  This command must be used to change a branch to
  332 	a specific revision without making a commit on it.
  333 
  334 `blob`::
  335 	Convert raw file data into a blob, for future use in a
  336 	`commit` command.  This command is optional and is not
  337 	needed to perform an import.
  338 
  339 `checkpoint`::
  340 	Forces fast-import to close the current packfile, generate its
  341 	unique SHA-1 checksum and index, and start a new packfile.
  342 	This command is optional and is not needed to perform
  343 	an import.
  344 
  345 `progress`::
  346 	Causes fast-import to echo the entire line to its own
  347 	standard output.  This command is optional and is not needed
  348 	to perform an import.
  349 
  350 `done`::
  351 	Marks the end of the stream. This command is optional
  352 	unless the `done` feature was requested using the
  353 	`--done` command-line option or `feature done` command.
  354 
  355 `get-mark`::
  356 	Causes fast-import to print the SHA-1 corresponding to a mark
  357 	to the file descriptor set with `--cat-blob-fd`, or `stdout` if
  358 	unspecified.
  359 
  360 `cat-blob`::
  361 	Causes fast-import to print a blob in 'cat-file --batch'
  362 	format to the file descriptor set with `--cat-blob-fd` or
  363 	`stdout` if unspecified.
  364 
  365 `ls`::
  366 	Causes fast-import to print a line describing a directory
  367 	entry in 'ls-tree' format to the file descriptor set with
  368 	`--cat-blob-fd` or `stdout` if unspecified.
  369 
  370 `feature`::
  371 	Enable the specified feature. This requires that fast-import
  372 	supports the specified feature, and aborts if it does not.
  373 
  374 `option`::
  375 	Specify any of the options listed under OPTIONS that do not
  376 	change stream semantics to suit the frontend's needs. This
  377 	command is optional and is not needed to perform an import.
  378 
  379 `commit`
  380 ~~~~~~~~
  381 Create or update a branch with a new commit, recording one logical
  382 change to the project.
  383 
  384 ....
  385 	'commit' SP <ref> LF
  386 	mark?
  387 	('author' (SP <name>)? SP LT <email> GT SP <when> LF)?
  388 	'committer' (SP <name>)? SP LT <email> GT SP <when> LF
  389 	data
  390 	('from' SP <commit-ish> LF)?
  391 	('merge' SP <commit-ish> LF)?
  392 	(filemodify | filedelete | filecopy | filerename | filedeleteall | notemodify)*
  393 	LF?
  394 ....
  395 
  396 where `<ref>` is the name of the branch to make the commit on.
  397 Typically branch names are prefixed with `refs/heads/` in
  398 Git, so importing the CVS branch symbol `RELENG-1_0` would use
  399 `refs/heads/RELENG-1_0` for the value of `<ref>`.  The value of
  400 `<ref>` must be a valid refname in Git.  As `LF` is not valid in
  401 a Git refname, no quoting or escaping syntax is supported here.
  402 
  403 A `mark` command may optionally appear, requesting fast-import to save a
  404 reference to the newly created commit for future use by the frontend
  405 (see below for format).  It is very common for frontends to mark
  406 every commit they create, thereby allowing future branch creation
  407 from any imported commit.
  408 
  409 The `data` command following `committer` must supply the commit
  410 message (see below for `data` command syntax).  To import an empty
  411 commit message use a 0 length data.  Commit messages are free-form
  412 and are not interpreted by Git.  Currently they must be encoded in
  413 UTF-8, as fast-import does not permit other encodings to be specified.
  414 
  415 Zero or more `filemodify`, `filedelete`, `filecopy`, `filerename`,
  416 `filedeleteall` and `notemodify` commands
  417 may be included to update the contents of the branch prior to
  418 creating the commit.  These commands may be supplied in any order.
  419 However it is recommended that a `filedeleteall` command precede
  420 all `filemodify`, `filecopy`, `filerename` and `notemodify` commands in
  421 the same commit, as `filedeleteall` wipes the branch clean (see below).
  422 
  423 The `LF` after the command is optional (it used to be required).
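
To make the ordering of the grammar concrete, here is a minimal sketch of
how a Python frontend might assemble one `commit` command.  The function
names are hypothetical.  The commit message uses the exact byte count form
of the `data` command (`data <count> LF <raw> LF`), which is described in
a later section, and the file changes are assumed to be already formatted
command lines.

```python
def data_command(payload: bytes) -> bytes:
    """Exact byte count form: 'data' SP <count> LF <raw> LF."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def commit_command(ref, committer, message, changes=(), mark=None, from_=None):
    """Assemble one `commit` command in the order the grammar requires.

    `committer` is the text after the 'committer ' keyword, and `changes`
    is an iterable of already formatted lines (bytes, without the LF).
    """
    out = [b"commit " + ref.encode("utf-8") + b"\n"]
    if mark is not None:
        out.append(b"mark :%d\n" % mark)
    out.append(b"committer " + committer.encode("utf-8") + b"\n")
    out.append(data_command(message.encode("utf-8")))
    if from_ is not None:
        out.append(b"from " + from_.encode("utf-8") + b"\n")
    for change in changes:
        out.append(change + b"\n")
    out.append(b"\n")  # the optional trailing LF
    return b"".join(out)
```

The result would be written to the standard input of 'git fast-import'.
Any mark referenced in `changes` (for example `:1`) must have been
defined by an earlier command in the same stream or loaded with
--import-marks.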
  424 
  425 `author`
  426 ^^^^^^^^
  427 An `author` command may optionally appear, if the author information
  428 might differ from the committer information.  If `author` is omitted
  429 then fast-import will automatically use the committer's information for
  430 the author portion of the commit.  See below for a description of
  431 the fields in `author`, as they are identical to `committer`.
  432 
  433 `committer`
  434 ^^^^^^^^^^^
  435 The `committer` command indicates who made this commit, and when
  436 they made it.
  437 
  438 Here `<name>` is the person's display name (for example
  439 ``Com M Itter'') and `<email>` is the person's email address
  440 (``\cm@example.com'').  `LT` and `GT` are the literal less-than (\x3c)
  441 and greater-than (\x3e) symbols.  These are required to delimit
  442 the email address from the other fields in the line.  Note that
  443 `<name>` and `<email>` are free-form and may contain any sequence
  444 of bytes, except `LT`, `GT` and `LF`.  `<name>` is typically UTF-8 encoded.
  445 
  446 The time of the change is specified by `<when>` using the date format
  447 that was selected by the --date-format=<fmt> command-line option.
  448 See ``Date Formats'' above for the set of supported formats, and
  449 their syntax.
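
Because `<name>` and `<email>` may not contain `LT`, `GT` or `LF`, a
frontend may want to validate identities taken from foreign data before
writing them.  A small sketch (the helper name `identity_line` is
hypothetical, and `when` is assumed to be already formatted for the
selected date format):

```python
def identity_line(command, name, email, when):
    """Build an `author`, `committer` or `tagger` line.

    `name` may be empty, in which case the optional (SP <name>) part
    of the grammar is omitted.
    """
    for label, value in (("name", name), ("email", email)):
        if any(ch in value for ch in "<>\n"):
            raise ValueError(f"{label} must not contain '<', '>' or LF: {value!r}")
    name_part = f" {name}" if name else ""
    return f"{command}{name_part} <{email}> {when}\n"
```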
  450 
  451 `from`
  452 ^^^^^^
  453 The `from` command is used to specify the commit to initialize
  454 this branch from.  This revision will be the first ancestor of the
  455 new commit.  The state of the tree built at this commit will begin
  456 with the state at the `from` commit, and be altered by the content
  457 modifications in this commit.
  458 
  459 Omitting the `from` command in the first commit of a new branch
  460 will cause fast-import to create that commit with no ancestor. This
  461 tends to be desired only for the initial commit of a project.
  462 If the frontend creates all files from scratch when making a new
  463 branch, a `merge` command may be used instead of `from` to start
  464 the commit with an empty tree.
  465 Omitting the `from` command on existing branches is usually desired,
  466 as the current commit on that branch is automatically assumed to
  467 be the first ancestor of the new commit.
  468 
  469 As `LF` is not valid in a Git refname or SHA-1 expression, no
  470 quoting or escaping syntax is supported within `<commit-ish>`.
  471 
  472 Here `<commit-ish>` is any of the following:
  473 
  474 * The name of an existing branch already in fast-import's internal branch
  475   table.  If fast-import doesn't know the name, it's treated as a SHA-1
  476   expression.
  477 
  478 * A mark reference, `:<idnum>`, where `<idnum>` is the mark number.
  479 +
  480 The reason fast-import uses `:` to denote a mark reference is that this character
  481 is not legal in a Git branch name.  The leading `:` makes it easy
  482 to distinguish between the mark 42 (`:42`) and the branch 42 (`42`
  483 or `refs/heads/42`), or an abbreviated SHA-1 which happened to
  484 consist only of base-10 digits.
  485 +
  486 Marks must be declared (via `mark`) before they can be used.
  487 
  488 * A complete 40 byte or abbreviated commit SHA-1 in hex.
  489 
  490 * Any valid Git SHA-1 expression that resolves to a commit.  See
  491   ``SPECIFYING REVISIONS'' in linkgit:gitrevisions[7] for details.
  492 
  493 * The special null SHA-1 (40 zeros) specifies that the branch is to be
  494   removed.
  495 
  496 The special case of restarting an incremental import from the
  497 current branch value should be written as:
  498 ----
  499 	from refs/heads/branch^0
  500 ----
  501 The `^0` suffix is necessary as fast-import does not permit a branch to
  502 start from itself, and the branch is created in memory before the
  503 `from` command is even read from the input.  Adding `^0` will force
  504 fast-import to resolve the commit through Git's revision parsing library,
  505 rather than its internal branch table, thereby loading in the
  506 existing value of the branch.
  507 
  508 `merge`
  509 ^^^^^^^
  510 Includes one additional ancestor commit.  The additional ancestry
  511 link does not change the way the tree state is built at this commit.
  512 If the `from` command is
  513 omitted when creating a new branch, the first `merge` commit will be
  514 the first ancestor of the current commit, and the branch will start
  515 out with no files.  An unlimited number of `merge` commands per
  516 commit are permitted by fast-import, thereby establishing an n-way merge.
  517 
  518 Here `<commit-ish>` is any of the commit specification expressions
  519 also accepted by `from` (see above).
  520 
  521 `filemodify`
  522 ^^^^^^^^^^^^
  523 Included in a `commit` command to add a new file or change the
  524 content of an existing file.  This command has two different means
  525 of specifying the content of the file.
  526 
  527 External data format::
  528 	The data content for the file was already supplied by a prior
  529 	`blob` command.  The frontend just needs to connect it.
  530 +
  531 ....
  532 	'M' SP <mode> SP <dataref> SP <path> LF
  533 ....
  534 +
  535 Here usually `<dataref>` must be either a mark reference (`:<idnum>`)
  536 set by a prior `blob` command, or a full 40-byte SHA-1 of an
  537 existing Git blob object.  If `<mode>` is `040000` then
  538 `<dataref>` must be the full 40-byte SHA-1 of an existing
  539 Git tree object or a mark reference set with `--import-marks`.
  540 
  541 Inline data format::
  542 	The data content for the file has not been supplied yet.
  543 	The frontend wants to supply it as part of this modify
  544 	command.
  545 +
  546 ....
  547 	'M' SP <mode> SP 'inline' SP <path> LF
  548 	data
  549 ....
  550 +
  551 See below for a detailed description of the `data` command.
  552 
  553 In both formats `<mode>` is the type of file entry, specified
  554 in octal.  Git only supports the following modes:
  555 
  556 * `100644` or `644`: A normal (not-executable) file.  The majority
  557   of files in most projects use this mode.  If in doubt, this is
  558   what you want.
  559 * `100755` or `755`: A normal, but executable, file.
  560 * `120000`: A symlink, the content of the file will be the link target.
  561 * `160000`: A gitlink, SHA-1 of the object refers to a commit in
  562   another repository. Git links can only be specified by SHA or through
  563   a commit mark. They are used to implement submodules.
  564 * `040000`: A subdirectory.  Subdirectories can only be specified by
  565   SHA or through a tree mark set with `--import-marks`.
  566 
  567 In both formats `<path>` is the complete path of the file to be added
  568 (if not already existing) or modified (if already existing).
  569 
  570 A `<path>` string must use UNIX-style directory separators (forward
  571 slash `/`), may contain any byte other than `LF`, and must not
  572 start with double quote (`"`).
  573 
  574 A path can use C-style string quoting; this is accepted in all cases
  575 and mandatory if the filename starts with double quote or contains
  576 `LF`. In C-style quoting, the complete name should be surrounded with
  577 double quotes, and any `LF`, backslash, or double quote characters
  578 must be escaped by preceding them with a backslash (e.g.,
  579 `"path/with\n, \\ and \" in it"`).
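
The quoting rule can be sketched in Python as follows.  `quote_path` is a
hypothetical helper; it quotes only when the rules above require it,
unless `force` is set (useful for a `filecopy` or `filerename` source
path containing SP).  It escapes only the three characters named above
and leaves every other byte as is.

```python
def quote_path(path: str, force: bool = False) -> str:
    """Return `path` C-style quoted when required (or when forced)."""
    needs_quoting = force or path.startswith('"') or "\n" in path
    if not needs_quoting:
        return path
    escaped = (
        path.replace("\\", "\\\\")  # backslash first, to avoid double escaping
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return '"' + escaped + '"'
```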
  580 
  581 The value of `<path>` must be in canonical form. That is, it must not:
  582 
  583 * contain an empty directory component (e.g. `foo//bar` is invalid),
  584 * end with a directory separator (e.g. `foo/` is invalid),
  585 * start with a directory separator (e.g. `/foo` is invalid),
  586 * contain the special component `.` or `..` (e.g. `foo/./bar` and
  587   `foo/../bar` are invalid).
  588 
  589 The root of the tree can be represented by an empty string as `<path>`.
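
A frontend can check these rules before emitting a path.  The following
sketch (with the hypothetical name `is_canonical_path`) treats the empty
string as the root of the tree and rejects every case listed above:

```python
def is_canonical_path(path: str) -> bool:
    """Return True if `path` is in the canonical form described above."""
    if path == "":
        return True  # the empty string denotes the root of the tree
    # Splitting on "/" yields an empty component for a leading, trailing
    # or doubled separator, so one test covers all three cases.
    return all(part not in ("", ".", "..") for part in path.split("/"))
```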
  590 
  591 It is recommended that `<path>` always be encoded using UTF-8.
  592 
  593 `filedelete`
  594 ^^^^^^^^^^^^
  595 Included in a `commit` command to remove a file or recursively
  596 delete an entire directory from the branch.  If the file or directory
  597 removal makes its parent directory empty, the parent directory will
  598 be automatically removed too.  This cascades up the tree until the
  599 first non-empty directory or the root is reached.
  600 
  601 ....
  602 	'D' SP <path> LF
  603 ....
  604 
  605 Here `<path>` is the complete path of the file or subdirectory to
  606 be removed from the branch.
  607 See `filemodify` above for a detailed description of `<path>`.
  608 
  609 `filecopy`
  610 ^^^^^^^^^^
  611 Recursively copies an existing file or subdirectory to a different
  612 location within the branch.  The source file or directory must
  613 exist.  If the destination exists it will be completely replaced
  614 by the content copied from the source.
  615 
  616 ....
  617 	'C' SP <path> SP <path> LF
  618 ....
  619 
  620 Here the first `<path>` is the source location and the second
  621 `<path>` is the destination.  See `filemodify` above for a detailed
  622 description of what `<path>` may look like.  To use a source path
  623 that contains SP the path must be quoted.
  624 
  625 A `filecopy` command takes effect immediately.  Once the source
  626 location has been copied to the destination any future commands
  627 applied to the source location will not impact the destination of
  628 the copy.
  629 
  630 `filerename`
  631 ^^^^^^^^^^^^
  632 Renames an existing file or subdirectory to a different location
  633 within the branch.  The source file or directory must exist.  If
  634 the destination exists it will be replaced by the source file or directory.
  635 
  636 ....
  637 	'R' SP <path> SP <path> LF
  638 ....
  639 
  640 Here the first `<path>` is the source location and the second
  641 `<path>` is the destination.  See `filemodify` above for a detailed
  642 description of what `<path>` may look like.  To use a source path
  643 that contains SP the path must be quoted.
  644 
  645 A `filerename` command takes effect immediately.  Once the source
  646 location has been renamed to the destination any future commands
  647 applied to the source location will create new files there and not
  648 impact the destination of the rename.
  649 
  650 Note that a `filerename` is the same as a `filecopy` followed by a
  651 `filedelete` of the source location.  There is a slight performance
  652 advantage to using `filerename`, but the advantage is so small
  653 that it is never worth trying to convert a delete/add pair in
  654 source material into a rename for fast-import.  This `filerename`
  655 command is provided just to simplify frontends that already have
  656 rename information and don't want to bother with decomposing it into a
  657 `filecopy` followed by a `filedelete`.
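
The equivalence can be checked on a toy model of a branch.  In the sketch
below a branch is a plain dict mapping file paths to data references;
directories exist only implicitly, so an emptied directory disappears on
its own.  All function names are hypothetical, this is not how
fast-import stores trees internally, and source and destination are
assumed not to overlap.

```python
def _subtree(tree, path):
    """Return {suffix: dataref} for the file or directory at `path`."""
    prefix = path + "/"
    found = {}
    for name, dataref in tree.items():
        if name == path:
            found[""] = dataref
        elif name.startswith(prefix):
            found[name[len(path):]] = dataref
    return found


def file_delete(tree, path):
    """Remove a file, or recursively remove a directory."""
    for suffix in _subtree(tree, path):
        del tree[path + suffix]


def file_copy(tree, src, dst):
    """Copy `src` to `dst`, completely replacing any existing `dst`."""
    entries = _subtree(tree, src)
    if not entries:
        raise KeyError(f"source does not exist: {src}")
    file_delete(tree, dst)
    for suffix, dataref in entries.items():
        tree[dst + suffix] = dataref


def file_rename(tree, src, dst):
    """A rename behaves like a copy followed by a delete of the source."""
    file_copy(tree, src, dst)
    file_delete(tree, src)
```

Since each operation changes the dict immediately, a later change to the
source path creates a new entry there and leaves the destination alone,
which mirrors the behavior described above.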
  658 
  659 `filedeleteall`
  660 ^^^^^^^^^^^^^^^
  661 Included in a `commit` command to remove all files (and also all
  662 directories) from the branch.  This command resets the internal
  663 branch structure to have no files in it, allowing the frontend
  664 to subsequently add all interesting files from scratch.
  665 
  666 ....
  667 	'deleteall' LF
  668 ....
  669 
  670 This command is extremely useful if the frontend does not know
  671 (or does not care to know) what files are currently on the branch,
  672 and therefore cannot generate the proper `filedelete` commands to
  673 update the content.
  674 
  675 Issuing a `filedeleteall` followed by the needed `filemodify`
  676 commands to set the correct content will produce the same results
  677 as sending only the needed `filemodify` and `filedelete` commands.
  678 The `filedeleteall` approach may however require fast-import to use slightly
  679 more memory per active branch (less than 1 MiB for even most large
  680 projects); so frontends that can easily obtain only the affected
  681 paths for a commit are encouraged to do so.
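
The two approaches can be compared side by side.  In this sketch a
snapshot is a hypothetical dict mapping each path to a `(mode, dataref)`
pair; `snapshot_commands` uses `deleteall`, while `delta_commands` emits
only the affected paths.  Path quoting and directory/file conflicts are
ignored for brevity.

```python
def snapshot_commands(new):
    """Rebuild the branch from scratch: deleteall, then every file."""
    lines = ["deleteall"]
    lines += [f"M {mode} {ref} {path}" for path, (mode, ref) in sorted(new.items())]
    return lines


def delta_commands(old, new):
    """Emit only the deletions and modifications between two snapshots."""
    lines = [f"D {path}" for path in sorted(old.keys() - new.keys())]
    lines += [
        f"M {mode} {ref} {path}"
        for path, (mode, ref) in sorted(new.items())
        if old.get(path) != (mode, ref)
    ]
    return lines
```

Both command lists lead to the same tree; the second needs the frontend
to know the previous state of the branch.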
  682 
  683 `notemodify`
  684 ^^^^^^^^^^^^
  685 Included in a `commit` `<notes_ref>` command to add a new note
  686 annotating a `<commit-ish>` or to change the contents of an existing annotation.
  687 Internally it is similar to filemodify 100644 on `<commit-ish>`
  688 path (maybe split into subdirectories). It's not advised to
  689 use any other commands to write to the `<notes_ref>` tree except
  690 `filedeleteall` to delete all existing notes in this tree.
  691 This command has two different means of specifying the content
  692 of the note.
  693 
  694 External data format::
  695 	The data content for the note was already supplied by a prior
  696 	`blob` command.  The frontend just needs to connect it to the
  697 	commit that is to be annotated.
  698 +
  699 ....
  700 	'N' SP <dataref> SP <commit-ish> LF
  701 ....
  702 +
  703 Here `<dataref>` can be either a mark reference (`:<idnum>`)
  704 set by a prior `blob` command, or a full 40-byte SHA-1 of an
  705 existing Git blob object.
  706 
  707 Inline data format::
  708 	The data content for the note has not been supplied yet.
  709 	The frontend wants to supply it as part of this modify
  710 	command.
  711 +
  712 ....
  713 	'N' SP 'inline' SP <commit-ish> LF
  714 	data
  715 ....
  716 +
  717 See below for a detailed description of the `data` command.
  718 
  719 In both formats `<commit-ish>` is any of the commit specification
  720 expressions also accepted by `from` (see above).
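
The two `notemodify` forms could be produced by helpers along these
lines.  This is only a sketch: the function names are hypothetical,
and the `<commit-ish>` and `<dataref>` arguments are assumed to be
mark references or SHA-1 strings that the caller has already validated.

```python
def note_inline(commit_ish, text):
    """Format an inline `notemodify` for use inside `commit <notes_ref>`."""
    payload = text.encode("utf-8")
    return (b"N inline " + commit_ish.encode("ascii") + b"\n"
            + b"data %d\n" % len(payload) + payload + b"\n")


def note_external(dataref, commit_ish):
    """Format a `notemodify` that points at an already supplied blob."""
    return ("N %s %s\n" % (dataref, commit_ish)).encode("ascii")
```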
  721 
  722 `mark`
  723 ~~~~~~
  724 Arranges for fast-import to save a reference to the current object, allowing
  725 the frontend to recall this object at a future point in time, without
  726 knowing its SHA-1.  Here the current object is the object creation
command the `mark` command appears within.  This can be `commit`,
`tag`, or `blob`, but `commit` is the most common usage.
  729 
  730 ....
  731 	'mark' SP ':' <idnum> LF
  732 ....
  733 
  734 where `<idnum>` is the number assigned by the frontend to this mark.
  735 The value of `<idnum>` is expressed as an ASCII decimal integer.
  736 The value 0 is reserved and cannot be used as
  737 a mark.  Only values greater than or equal to 1 may be used as marks.
  738 
  739 New marks are created automatically.  Existing marks can be moved
  740 to another object simply by reusing the same `<idnum>` in another
  741 `mark` command.
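
Because reusing an `<idnum>` silently moves the mark, a frontend may
want a single place that hands out mark numbers.  The class below is
a minimal sketch of such an allocator; its name and its use of a
source identifier (for example a revision name) as the key are
assumptions made for illustration.

```python
class MarkAllocator:
    """Hand out fast-import marks (1, 2, 3, ...) keyed by a source identifier."""

    def __init__(self):
        self._marks = {}

    def mark_for(self, source_id):
        # Marks start at 1 because the value 0 is reserved.
        if source_id not in self._marks:
            self._marks[source_id] = len(self._marks) + 1
        return self._marks[source_id]

    def command(self, source_id):
        """Return the `mark` line for the object created from source_id."""
        return b"mark :%d\n" % self.mark_for(source_id)
```

Asking twice for the same source identifier returns the same mark, so
later `from` or `merge` commands can refer to it as `:<idnum>`.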
  742 
  743 `tag`
  744 ~~~~~
  745 Creates an annotated tag referring to a specific commit.  To create
  746 lightweight (non-annotated) tags see the `reset` command below.
  747 
  748 ....
  749 	'tag' SP <name> LF
  750 	'from' SP <commit-ish> LF
  751 	'tagger' (SP <name>)? SP LT <email> GT SP <when> LF
  752 	data
  753 ....
  754 
  755 where `<name>` is the name of the tag to create.
  756 
  757 Tag names are automatically prefixed with `refs/tags/` when stored
  758 in Git, so importing the CVS branch symbol `RELENG-1_0-FINAL` would
  759 use just `RELENG-1_0-FINAL` for `<name>`, and fast-import will write the
  760 corresponding ref as `refs/tags/RELENG-1_0-FINAL`.
  761 
  762 The value of `<name>` must be a valid refname in Git and therefore
  763 may contain forward slashes.  As `LF` is not valid in a Git refname,
  764 no quoting or escaping syntax is supported here.
  765 
  766 The `from` command is the same as in the `commit` command; see
  767 above for details.
  768 
  769 The `tagger` command uses the same format as `committer` within
  770 `commit`; again see above for details.
  771 
  772 The `data` command following `tagger` must supply the annotated tag
  773 message (see below for `data` command syntax).  To import an empty
  774 tag message use a 0 length data.  Tag messages are free-form and are
  775 not interpreted by Git.  Currently they must be encoded in UTF-8,
  776 as fast-import does not permit other encodings to be specified.
  777 
  778 Signing annotated tags during import from within fast-import is not
  779 supported.  Trying to include your own PGP/GPG signature is not
  780 recommended, as the frontend does not (easily) have access to the
  781 complete set of bytes which normally goes into such a signature.
  782 If signing is required, create lightweight tags from within fast-import with
  783 `reset`, then create the annotated versions of those tags offline
  784 with the standard 'git tag' process.
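
A frontend might format the `tag` command with a helper such as the
following sketch.  The function name and parameters are hypothetical;
`when` is assumed to already be in the date format in use (for
example `1190000000 +0000` in the raw format), and the message is
encoded as UTF-8 as required above.

```python
def annotated_tag(name, commit_ish, tagger, email, when, message):
    """Format a `tag` command for an annotated tag."""
    if "\n" in name:
        raise ValueError("LF is not valid in a tag name")
    payload = message.encode("utf-8")  # tag messages must be UTF-8
    header = "tag %s\nfrom %s\ntagger %s <%s> %s\n" % (
        name, commit_ish, tagger, email, when)
    # The byte count covers the encoded message only, not the final LF.
    return header.encode("utf-8") + b"data %d\n" % len(payload) + payload + b"\n"
```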
  785 
  786 `reset`
  787 ~~~~~~~
  788 Creates (or recreates) the named branch, optionally starting from
  789 a specific revision.  The reset command allows a frontend to issue
  790 a new `from` command for an existing branch, or to create a new
  791 branch from an existing commit without creating a new commit.
  792 
  793 ....
  794 	'reset' SP <ref> LF
  795 	('from' SP <commit-ish> LF)?
  796 	LF?
  797 ....
  798 
  799 For a detailed description of `<ref>` and `<commit-ish>` see above
  800 under `commit` and `from`.
  801 
  802 The `LF` after the command is optional (it used to be required).
  803 
  804 The `reset` command can also be used to create lightweight
  805 (non-annotated) tags.  For example:
  806 
  807 ====
  808 	reset refs/tags/938
  809 	from :938
  810 ====
  811 
  812 would create the lightweight tag `refs/tags/938` referring to
  813 whatever commit mark `:938` references.
  814 
  815 `blob`
  816 ~~~~~~
  817 Requests writing one file revision to the packfile.  The revision
  818 is not connected to any commit; this connection must be formed in
  819 a subsequent `commit` command by referencing the blob through an
  820 assigned mark.
  821 
  822 ....
  823 	'blob' LF
  824 	mark?
  825 	data
  826 ....
  827 
  828 The mark command is optional here as some frontends have chosen
  829 to generate the Git SHA-1 for the blob on their own, and feed that
  830 directly to `commit`.  This is typically more work than it's worth
  831 however, as marks are inexpensive to store and easy to use.
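
The following sketch formats a `blob` command with an optional mark.
The function name is an assumption; the output follows the syntax
shown above and uses the exact byte count form of `data`.

```python
def blob(content, mark=None):
    """Format a `blob` command for `content` (bytes)."""
    out = [b"blob\n"]
    if mark is not None:
        if mark < 1:
            raise ValueError("marks must be greater than or equal to 1")
        out.append(b"mark :%d\n" % mark)
    out.append(b"data %d\n" % len(content))
    out.append(content)
    out.append(b"\n")  # optional LF, kept so the next command starts a line
    return b"".join(out)
```

A later `commit` could then refer to the blob as `:<idnum>` in a
`filemodify` command instead of computing its SHA-1.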
  832 
  833 `data`
  834 ~~~~~~
  835 Supplies raw data (for use as blob/file content, commit messages, or
  836 annotated tag messages) to fast-import.  Data can be supplied using an exact
  837 byte count or delimited with a terminating line.  Real frontends
  838 intended for production-quality conversions should always use the
  839 exact byte count format, as it is more robust and performs better.
  840 The delimited format is intended primarily for testing fast-import.
  841 
  842 Comment lines appearing within the `<raw>` part of `data` commands
  843 are always taken to be part of the body of the data and are therefore
  844 never ignored by fast-import.  This makes it safe to import any
  845 file/message content whose lines might start with `#`.
  846 
  847 Exact byte count format::
  848 	The frontend must specify the number of bytes of data.
  849 +
  850 ....
  851 	'data' SP <count> LF
  852 	<raw> LF?
  853 ....
  854 +
  855 where `<count>` is the exact number of bytes appearing within
  856 `<raw>`.  The value of `<count>` is expressed as an ASCII decimal
  857 integer.  The `LF` on either side of `<raw>` is not
  858 included in `<count>` and will not be included in the imported data.
  859 +
  860 The `LF` after `<raw>` is optional (it used to be required) but
  861 recommended.  Always including it makes debugging a fast-import
  862 stream easier as the next command always starts in column 0
  863 of the next line, even if `<raw>` did not end with an `LF`.
  864 
  865 Delimited format::
  866 	A delimiter string is used to mark the end of the data.
  867 	fast-import will compute the length by searching for the delimiter.
  868 	This format is primarily useful for testing and is not
  869 	recommended for real data.
  870 +
  871 ....
  872 	'data' SP '<<' <delim> LF
  873 	<raw> LF
  874 	<delim> LF
  875 	LF?
  876 ....
  877 +
  878 where `<delim>` is the chosen delimiter string.  The string `<delim>`
  879 must not appear on a line by itself within `<raw>`, as otherwise
  880 fast-import will think the data ends earlier than it really does.  The `LF`
  881 immediately trailing `<raw>` is part of `<raw>`.  This is one of
the limitations of the delimited format: it is impossible to supply
  883 a data chunk which does not have an LF as its last byte.
  884 +
  885 The `LF` after `<delim> LF` is optional (it used to be required).
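
The difference between the two formats can be illustrated with a
short sketch.  The helper names are hypothetical.  Note that
`<count>` is a number of bytes, not characters, and that the delimited
form is only usable when the data ends in `LF` and no line equals the
delimiter.

```python
def data_exact(raw):
    """Exact byte count format: works for any bytes value."""
    return b"data %d\n" % len(raw) + raw + b"\n"


def fits_delimited(raw, delim=b"EOF"):
    """True if raw can be sent in the delimited format with this delimiter."""
    return raw.endswith(b"\n") and delim not in raw.split(b"\n")


def data_delimited(raw, delim=b"EOF"):
    """Delimited format: intended for tests, not for real data."""
    if not fits_delimited(raw, delim):
        raise ValueError("data cannot be expressed in the delimited format")
    # The LF that ends raw is part of the data itself.
    return b"data <<" + delim + b"\n" + raw + delim + b"\n"
```

A production frontend would normally call only the exact byte count
helper, since it accepts every input the delimited form rejects.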
  886 
  887 `checkpoint`
  888 ~~~~~~~~~~~~
Forces fast-import to close the current packfile, start a new one, and
save out all current branch refs, tags and marks.
  891 
  892 ....
  893 	'checkpoint' LF
  894 	LF?
  895 ....
  896 
  897 Note that fast-import automatically switches packfiles when the current
  898 packfile reaches --max-pack-size, or 4 GiB, whichever limit is
  899 smaller.  During an automatic packfile switch fast-import does not update
  900 the branch refs, tags or marks.
  901 
  902 As a `checkpoint` can require a significant amount of CPU time and
  903 disk IO (to compute the overall pack SHA-1 checksum, generate the
  904 corresponding index file, and update the refs) it can easily take
  905 several minutes for a single `checkpoint` command to complete.
  906 
  907 Frontends may choose to issue checkpoints during extremely large
  908 and long running imports, or when they need to allow another Git
  909 process access to a branch.  However given that a 30 GiB Subversion
  910 repository can be loaded into Git through fast-import in about 3 hours,
  911 explicit checkpointing may not be necessary.
  912 
  913 The `LF` after the command is optional (it used to be required).
  914 
  915 `progress`
  916 ~~~~~~~~~~
  917 Causes fast-import to print the entire `progress` line unmodified to
  918 its standard output channel (file descriptor 1) when the command is
  919 processed from the input stream.  The command otherwise has no impact
  920 on the current import, or on any of fast-import's internal state.
  921 
  922 ....
  923 	'progress' SP <any> LF
  924 	LF?
  925 ....
  926 
  927 The `<any>` part of the command may contain any sequence of bytes
  928 that does not contain `LF`.  The `LF` after the command is optional.
  929 Callers may wish to process the output through a tool such as sed to
  930 remove the leading part of the line, for example:
  931 
  932 ====
  933 	frontend | git fast-import | sed 's/^progress //'
  934 ====
  935 
  936 Placing a `progress` command immediately after a `checkpoint` will
  937 inform the reader when the `checkpoint` has been completed and it
  938 can safely access the refs that fast-import updated.
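
The pairing of `checkpoint` and `progress` might look like this in a
frontend and in the process reading fast-import's standard output.
Both function names are illustrative assumptions; the second one
mirrors the sed expression shown above.

```python
def checkpoint_then_progress(message):
    """Format a `checkpoint` followed by a `progress` announcing it."""
    if "\n" in message:
        raise ValueError("progress text must not contain LF")
    return b"checkpoint\n" + b"progress " + message.encode("utf-8") + b"\n"


def progress_text(output_line):
    """Return the <any> part of a progress line, or None for other output."""
    prefix = b"progress "
    if not output_line.startswith(prefix):
        return None
    text = output_line[len(prefix):]
    return text[:-1] if text.endswith(b"\n") else text
```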
  939 
  940 `get-mark`
  941 ~~~~~~~~~~
  942 Causes fast-import to print the SHA-1 corresponding to a mark to
  943 stdout or to the file descriptor previously arranged with the
  944 `--cat-blob-fd` argument. The command otherwise has no impact on the
  945 current import; its purpose is to retrieve SHA-1s that later commits
  946 might want to refer to in their commit messages.
  947 
  948 ....
  949 	'get-mark' SP ':' <idnum> LF
  950 ....
  951 
  952 This command can be used anywhere in the stream that comments are
  953 accepted.  In particular, the `get-mark` command can be used in the
  954 middle of a commit but not in the middle of a `data` command.
  955 
  956 See ``Responses To Commands'' below for details about how to read
  957 this output safely.
  958 
  959 `cat-blob`
  960 ~~~~~~~~~~
  961 Causes fast-import to print a blob to a file descriptor previously
  962 arranged with the `--cat-blob-fd` argument.  The command otherwise
  963 has no impact on the current import; its main purpose is to
  964 retrieve blobs that may be in fast-import's memory but not
  965 accessible from the target repository.
  966 
  967 ....
  968 	'cat-blob' SP <dataref> LF
  969 ....
  970 
  971 The `<dataref>` can be either a mark reference (`:<idnum>`)
  972 set previously or a full 40-byte SHA-1 of a Git blob, preexisting or
  973 ready to be written.
  974 
  975 Output uses the same format as `git cat-file --batch`:
  976 
  977 ====
  978 	<sha1> SP 'blob' SP <size> LF
  979 	<contents> LF
  980 ====
  981 
  982 This command can be used anywhere in the stream that comments are
  983 accepted.  In particular, the `cat-blob` command can be used in the
  984 middle of a commit but not in the middle of a `data` command.
  985 
  986 See ``Responses To Commands'' below for details about how to read
  987 this output safely.
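
A reader for the `cat-blob` response could be sketched as follows.
The function name is hypothetical, and the sketch assumes a buffered
binary stream whose `read(n)` blocks until `n` bytes are available.
An in-memory stream stands in for the `--cat-blob-fd` pipe.

```python
import io


def read_cat_blob(stream):
    """Read one cat-blob response and return (sha1, contents)."""
    header = stream.readline()
    sha1, kind, size = header.rstrip(b"\n").split(b" ")
    if kind != b"blob":
        raise ValueError("unexpected object type: %r" % (kind,))
    count = int(size)
    contents = stream.read(count)
    if len(contents) != count:
        raise EOFError("stream ended inside blob contents")
    # The contents are followed by one LF that is not part of the blob.
    if stream.read(1) != b"\n":
        raise ValueError("missing LF after blob contents")
    return sha1.decode("ascii"), contents


# In-memory stand-in for the pipe; the SHA-1 here is a placeholder.
fake_pipe = io.BytesIO(b"a" * 40 + b" blob 6\nhello\n\n")
sha1, contents = read_cat_blob(fake_pipe)
```

Reading exactly `<size>` bytes rather than scanning for a line ending
keeps the reader correct for binary blobs and for blobs that contain
`LF`.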
  988 
  989 `ls`
  990 ~~~~
  991 Prints information about the object at a path to a file descriptor
  992 previously arranged with the `--cat-blob-fd` argument.  This allows
  993 printing a blob from the active commit (with `cat-blob`) or copying a
  994 blob or tree from a previous commit for use in the current one (with
  995 `filemodify`).
  996 
  997 The `ls` command can be used anywhere in the stream that comments are
  998 accepted, including the middle of a commit.
  999 
 1000 Reading from the active commit::
 1001 	This form can only be used in the middle of a `commit`.
 1002 	The path names a directory entry within fast-import's
 1003 	active commit.  The path must be quoted in this case.
 1004 +
 1005 ....
 1006 	'ls' SP <path> LF
 1007 ....
 1008 
 1009 Reading from a named tree::
 1010 	The `<dataref>` can be a mark reference (`:<idnum>`) or the
 1011 	full 40-byte SHA-1 of a Git tag, commit, or tree object,
 1012 	preexisting or waiting to be written.
 1013 	The path is relative to the top level of the tree
 1014 	named by `<dataref>`.
 1015 +
 1016 ....
 1017 	'ls' SP <dataref> SP <path> LF
 1018 ....
 1019 
 1020 See `filemodify` above for a detailed description of `<path>`.
 1021 
 1022 Output uses the same format as `git ls-tree <tree> -- <path>`:
 1023 
 1024 ====
 1025 	<mode> SP ('blob' | 'tree' | 'commit') SP <dataref> HT <path> LF
 1026 ====
 1027 
 1028 The <dataref> represents the blob, tree, or commit object at <path>
 1029 and can be used in later 'get-mark', 'cat-blob', 'filemodify', or
 1030 'ls' commands.
 1031 
 1032 If there is no file or subtree at that path, 'git fast-import' will
 1033 instead report
 1034 
 1035 ====
 1036 	missing SP <path> LF
 1037 ====
 1038 
 1039 See ``Responses To Commands'' below for details about how to read
 1040 this output safely.
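
The two possible `ls` responses can be told apart with a small parser
such as this sketch.  The function name is an assumption, and the path
is returned as raw bytes without undoing any quoting.

```python
def parse_ls_response(line):
    """Return (mode, type, dataref, path), or None for a `missing` response."""
    if line.endswith(b"\n"):
        line = line[:-1]
    if line.startswith(b"missing "):
        return None
    # The path is separated from the other fields by a horizontal tab.
    meta, path = line.split(b"\t", 1)
    mode, kind, dataref = meta.split(b" ")
    if kind not in (b"blob", b"tree", b"commit"):
        raise ValueError("unexpected object type: %r" % (kind,))
    return mode.decode("ascii"), kind.decode("ascii"), dataref.decode("ascii"), path
```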
 1041 
 1042 `feature`
 1043 ~~~~~~~~~
 1044 Require that fast-import supports the specified feature, or abort if
 1045 it does not.
 1046 
 1047 ....
 1048 	'feature' SP <feature> ('=' <argument>)? LF
 1049 ....
 1050 
 1051 The <feature> part of the command may be any one of the following:
 1052 
 1053 date-format::
 1054 export-marks::
 1055 relative-marks::
 1056 no-relative-marks::
 1057 force::
 1058 	Act as though the corresponding command-line option with
 1059 	a leading `--` was passed on the command line
 1060 	(see OPTIONS, above).
 1061 
 1062 import-marks::
 1063 import-marks-if-exists::
	Like --import-marks except in three respects: first, only one
	"feature import-marks" or "feature import-marks-if-exists"
	command is allowed per stream; second, an --import-marks=
	or --import-marks-if-exists command-line option overrides
	any of these "feature" commands in the stream; third,
	"feature import-marks-if-exists", like the corresponding
	command-line option, silently skips a nonexistent file.
 1071 
 1072 get-mark::
 1073 cat-blob::
 1074 ls::
 1075 	Require that the backend support the 'get-mark', 'cat-blob',
 1076 	or 'ls' command respectively.
 1077 	Versions of fast-import not supporting the specified command
 1078 	will exit with a message indicating so.
 1079 	This lets the import error out early with a clear message,
 1080 	rather than wasting time on the early part of an import
 1081 	before the unsupported command is detected.
 1082 
 1083 notes::
 1084 	Require that the backend support the 'notemodify' (N)
 1085 	subcommand to the 'commit' command.
 1086 	Versions of fast-import not supporting notes will exit
 1087 	with a message indicating so.
 1088 
 1089 done::
 1090 	Error out if the stream ends without a 'done' command.
 1091 	Without this feature, errors causing the frontend to end
 1092 	abruptly at a convenient point in the stream can go
	undetected.  This may occur, for example, if an import
	front end dies in mid-operation without sending SIGTERM
	or SIGKILL to its subordinate git fast-import instance.
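
A frontend can guard against truncated streams by wrapping its output
as in the sketch below.  The helper names are hypothetical, and
placing the `feature` commands at the very start of the stream is a
choice made here for simplicity.

```python
def feature(name, argument=None):
    """Format one `feature` command."""
    if argument is None:
        return ("feature %s\n" % name).encode("utf-8")
    return ("feature %s=%s\n" % (name, argument)).encode("utf-8")


def guarded_stream(body, features=()):
    """Wrap a stream body (ending in LF) so that truncation is detected."""
    head = b"".join(feature(name, arg) for name, arg in features)
    # `feature done` makes the final `done` command mandatory.
    return b"feature done\n" + head + body + b"done\n"
```

If the frontend dies before the last line is written, fast-import
sees a stream without `done` and reports an error instead of quietly
accepting a partial import.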
 1096 
 1097 `option`
 1098 ~~~~~~~~
 1099 Processes the specified option so that git fast-import behaves in a
 1100 way that suits the frontend's needs.
 1101 Note that options specified by the frontend are overridden by any
 1102 options the user may specify to git fast-import itself.
 1103 
 1104 ....
 1105     'option' SP <option> LF
 1106 ....
 1107 
 1108 The `<option>` part of the command may contain any of the options
 1109 listed in the OPTIONS section that do not change import semantics,
 1110 without the leading `--` and is treated in the same way.
 1111 
Option commands must be the first commands on the input (not counting
feature commands); giving an option command after any non-option
command is an error.
 1115 
The following command-line options change import semantics and therefore
may not be passed as options:
 1118 
 1119 * date-format
 1120 * import-marks
 1121 * export-marks
 1122 * cat-blob-fd
 1123 * force
 1124 
 1125 `done`
 1126 ~~~~~~
If the `done` feature is not in use, this command is treated as if
EOF was read.
 1128 This can be used to tell fast-import to finish early.
 1129 
 1130 If the `--done` command-line option or `feature done` command is
 1131 in use, the `done` command is mandatory and marks the end of the
 1132 stream.
 1133 
 1134 RESPONSES TO COMMANDS
 1135 ---------------------
 1136 New objects written by fast-import are not available immediately.
 1137 Most fast-import commands have no visible effect until the next
 1138 checkpoint (or completion).  The frontend can send commands to
 1139 fill fast-import's input pipe without worrying about how quickly
 1140 they will take effect, which improves performance by simplifying
 1141 scheduling.
 1142 
 1143 For some frontends, though, it is useful to be able to read back
 1144 data from the current repository as it is being updated (for
 1145 example when the source material describes objects in terms of
 1146 patches to be applied to previously imported objects).  This can
 1147 be accomplished by connecting the frontend and fast-import via
 1148 bidirectional pipes:
 1149 
 1150 ====
 1151 	mkfifo fast-import-output
 1152 	frontend <fast-import-output |
 1153 	git fast-import >fast-import-output
 1154 ====
 1155 
 1156 A frontend set up this way can use `progress`, `get-mark`, `ls`, and
 1157 `cat-blob` commands to read information from the import in progress.
 1158 
 1159 To avoid deadlock, such frontends must completely consume any
 1160 pending output from `progress`, `ls`, `get-mark`, and `cat-blob` before
 1161 performing writes to fast-import that might block.
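
The discipline of flushing a request and draining its response before
writing anything else can be sketched as follows.  The function names
are hypothetical.  The sketch assumes that the reply to `get-mark` is
a single line holding a 40-character hexadecimal SHA-1 and that no
`progress` output is pending.  `start_backend` shows one possible way
to wire the pipes and is not executed here; in-memory streams stand in
for them.

```python
import io
import re
import subprocess


def start_backend():
    """Start `git fast-import` with pipes in both directions (not run here)."""
    return subprocess.Popen(["git", "fast-import"],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)


def get_mark(to_backend, from_backend, idnum):
    """Ask for the SHA-1 behind a mark and consume the reply before returning."""
    to_backend.write(b"get-mark :%d\n" % idnum)
    to_backend.flush()  # make sure the command actually reaches fast-import
    reply = from_backend.readline()  # drain the response before writing more
    if not re.fullmatch(rb"[0-9a-f]{40}\n", reply):
        raise ValueError("unexpected get-mark reply: %r" % (reply,))
    return reply[:40].decode("ascii")


# In-memory stand-ins for backend.stdin and backend.stdout.
fake_stdin, fake_stdout = io.BytesIO(), io.BytesIO(b"c" * 40 + b"\n")
sha1 = get_mark(fake_stdin, fake_stdout, 7)
```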
 1162 
 1163 CRASH REPORTS
 1164 -------------
 1165 If fast-import is supplied invalid input it will terminate with a
 1166 non-zero exit status and create a crash report in the top level of
 1167 the Git repository it was importing into.  Crash reports contain
 1168 a snapshot of the internal fast-import state as well as the most
recent commands that led up to the crash.
 1170 
 1171 All recent commands (including stream comments, file changes and
 1172 progress commands) are shown in the command history within the crash
 1173 report, but raw file data and commit messages are excluded from the
 1174 crash report.  This exclusion saves space within the report file
 1175 and reduces the amount of buffering that fast-import must perform
 1176 during execution.
 1177 
 1178 After writing a crash report fast-import will close the current
 1179 packfile and export the marks table.  This allows the frontend
 1180 developer to inspect the repository state and resume the import from
 1181 the point where it crashed.  The modified branches and tags are not
 1182 updated during a crash, as the import did not complete successfully.
 1183 Branch and tag information can be found in the crash report and
 1184 must be applied manually if the update is needed.
 1185 
 1186 An example crash:
 1187 
 1188 ====
 1189 	$ cat >in <<END_OF_INPUT
 1190 	# my very first test commit
 1191 	commit refs/heads/master
 1192 	committer Shawn O. Pearce <spearce> 19283 -0400
 1193 	# who is that guy anyway?
 1194 	data <<EOF
 1195 	this is my commit
 1196 	EOF
 1197 	M 644 inline .gitignore
 1198 	data <<EOF
 1199 	.gitignore
 1200 	EOF
 1201 	M 777 inline bob
 1202 	END_OF_INPUT
 1203 
 1204 	$ git fast-import <in
 1205 	fatal: Corrupt mode: M 777 inline bob
 1206 	fast-import: dumping crash report to .git/fast_import_crash_8434
 1207 
 1208 	$ cat .git/fast_import_crash_8434
 1209 	fast-import crash report:
 1210 	    fast-import process: 8434
 1211 	    parent process     : 1391
 1212 	    at Sat Sep 1 00:58:12 2007
 1213 
 1214 	fatal: Corrupt mode: M 777 inline bob
 1215 
 1216 	Most Recent Commands Before Crash
 1217 	---------------------------------
 1218 	  # my very first test commit
 1219 	  commit refs/heads/master
 1220 	  committer Shawn O. Pearce <spearce> 19283 -0400
 1221 	  # who is that guy anyway?
 1222 	  data <<EOF
 1223 	  M 644 inline .gitignore
 1224 	  data <<EOF
 1225 	* M 777 inline bob
 1226 
 1227 	Active Branch LRU
 1228 	-----------------
 1229 	    active_branches = 1 cur, 5 max
 1230 
 1231 	  pos  clock name
 1232 	  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1233 	   1)      0 refs/heads/master
 1234 
 1235 	Inactive Branches
 1236 	-----------------
 1237 	refs/heads/master:
 1238 	  status      : active loaded dirty
 1239 	  tip commit  : 0000000000000000000000000000000000000000
 1240 	  old tree    : 0000000000000000000000000000000000000000
 1241 	  cur tree    : 0000000000000000000000000000000000000000
 1242 	  commit clock: 0
 1243 	  last pack   :
 1244 
 1245 
 1246 	-------------------
 1247 	END OF CRASH REPORT
 1248 ====
 1249 
 1250 TIPS AND TRICKS
 1251 ---------------
 1252 The following tips and tricks have been collected from various
 1253 users of fast-import, and are offered here as suggestions.
 1254 
 1255 Use One Mark Per Commit
 1256 ~~~~~~~~~~~~~~~~~~~~~~~
 1257 When doing a repository conversion, use a unique mark per commit
 1258 (`mark :<n>`) and supply the --export-marks option on the command
 1259 line.  fast-import will dump a file which lists every mark and the Git
 1260 object SHA-1 that corresponds to it.  If the frontend can tie
 1261 the marks back to the source repository, it is easy to verify the
 1262 accuracy and completeness of the import by comparing each Git
 1263 commit to the corresponding source revision.
 1264 
 1265 Coming from a system such as Perforce or Subversion this should be
 1266 quite simple, as the fast-import mark can also be the Perforce changeset
 1267 number or the Subversion revision number.
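
A verification script could load the exported marks with a few lines
of Python.  This sketch assumes the marks file format described under
--export-marks (one `:<idnum> SP <sha1>` pair per line) and assumes
that the frontend used source revision numbers as marks; the function
names are illustrative.

```python
def parse_marks(text):
    """Return a dict mapping mark number to object SHA-1."""
    marks = {}
    for line in text.splitlines():
        if not line:
            continue
        mark, sha1 = line.split(" ")
        if not mark.startswith(":"):
            raise ValueError("malformed marks line: %r" % (line,))
        marks[int(mark[1:])] = sha1
    return marks


def missing_revisions(marks, revisions):
    """Return the source revision numbers that have no exported mark."""
    return [rev for rev in revisions if rev not in marks]
```

Each remaining entry pairs a source revision with the Git commit to
compare it against.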
 1268 
 1269 Freely Skip Around Branches
 1270 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1271 Don't bother trying to optimize the frontend to stick to one branch
 1272 at a time during an import.  Although doing so might be slightly
 1273 faster for fast-import, it tends to increase the complexity of the frontend
 1274 code considerably.
 1275 
The branch LRU built into fast-import tends to behave very well, and the
 1277 cost of activating an inactive branch is so low that bouncing around
 1278 between branches has virtually no impact on import performance.
 1279 
 1280 Handling Renames
 1281 ~~~~~~~~~~~~~~~~
 1282 When importing a renamed file or directory, simply delete the old
 1283 name(s) and modify the new name(s) during the corresponding commit.
 1284 Git performs rename detection after-the-fact, rather than explicitly
 1285 during a commit.
 1286 
 1287 Use Tag Fixup Branches
 1288 ~~~~~~~~~~~~~~~~~~~~~~
Some other SCM systems let the user create a tag from multiple
files which are not from the same commit/changeset, or create
tags which are a subset of the files available in the repository.
 1292 
 1293 Importing these tags as-is in Git is impossible without making at
 1294 least one commit which ``fixes up'' the files to match the content
 1295 of the tag.  Use fast-import's `reset` command to reset a dummy branch
 1296 outside of your normal branch space to the base commit for the tag,
 1297 then commit one or more file fixup commits, and finally tag the
 1298 dummy branch.
 1299 
For example, since all normal branches are stored under `refs/heads/`,
 1301 name the tag fixup branch `TAG_FIXUP`.  This way it is impossible for
 1302 the fixup branch used by the importer to have namespace conflicts
 1303 with real branches imported from the source (the name `TAG_FIXUP`
 1304 is not `refs/heads/TAG_FIXUP`).
 1305 
 1306 When committing fixups, consider using `merge` to connect the
 1307 commit(s) which are supplying file revisions to the fixup branch.
 1308 Doing so will allow tools such as 'git blame' to track
 1309 through the real commit history and properly annotate the source
 1310 files.
 1311 
After fast-import terminates, the frontend will need to do `rm .git/TAG_FIXUP`
 1313 to remove the dummy branch.
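
The sequence of reset, fixup commit and tag might be generated along
the following lines.  This is a sketch under several assumptions: the
function and parameter names are invented, a single fixup commit is
enough, all fixed-up files are regular files already supplied as
marked blobs, and `ident` is a complete `Name <email> when` string.

```python
def _data(text):
    payload = text.encode("utf-8")
    return b"data %d\n" % len(payload) + payload + b"\n"


def tag_fixup_stream(tag, base_mark, fixup_mark, ident, fixes, merged_marks=()):
    """Build reset + fixup commit + tag; `fixes` maps path to blob mark."""
    out = [
        b"reset TAG_FIXUP\nfrom :%d\n\n" % base_mark,
        b"commit TAG_FIXUP\nmark :%d\n" % fixup_mark,
        b"committer " + ident.encode("utf-8") + b"\n",
        _data("Fix up files for tag %s\n" % tag),
    ]
    # Optional merges connect the commits that supplied the file revisions.
    for mark in merged_marks:
        out.append(b"merge :%d\n" % mark)
    for path in sorted(fixes):
        out.append(b"M 100644 :%d " % fixes[path] + path.encode("utf-8") + b"\n")
    out.append(b"\n")
    out.append(b"tag " + tag.encode("utf-8") + b"\nfrom :%d\n" % fixup_mark)
    out.append(b"tagger " + ident.encode("utf-8") + b"\n")
    out.append(_data("Tag %s\n" % tag))
    return b"".join(out)
```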
 1314 
 1315 Import Now, Repack Later
 1316 ~~~~~~~~~~~~~~~~~~~~~~~~
 1317 As soon as fast-import completes the Git repository is completely valid
 1318 and ready for use.  Typically this takes only a very short time,
 1319 even for considerably large projects (100,000+ commits).
 1320 
 1321 However repacking the repository is necessary to improve data
 1322 locality and access performance.  It can also take hours on extremely
large projects (especially if -f and a large --window parameter are
 1324 used).  Since repacking is safe to run alongside readers and writers,
 1325 run the repack in the background and let it finish when it finishes.
 1326 There is no reason to wait to explore your new Git project!
 1327 
 1328 If you choose to wait for the repack, don't try to run benchmarks
 1329 or performance tests until repacking is completed.  fast-import outputs
 1330 suboptimal packfiles that are simply never seen in real use
 1331 situations.
 1332 
 1333 Repacking Historical Data
 1334 ~~~~~~~~~~~~~~~~~~~~~~~~~
 1335 If you are repacking very old imported data (e.g. older than the
 1336 last year), consider expending some extra CPU time and supplying
 1337 --window=50 (or higher) when you run 'git repack'.
 1338 This will take longer, but will also produce a smaller packfile.
 1339 You only need to expend the effort once, and everyone using your
 1340 project will benefit from the smaller repository.
 1341 
 1342 Include Some Progress Messages
 1343 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1344 Every once in a while have your frontend emit a `progress` message
 1345 to fast-import.  The contents of the messages are entirely free-form,
 1346 so one suggestion would be to output the current month and year
 1347 each time the current commit date moves into the next month.
 1348 Your users will feel better knowing how much of the data stream
 1349 has been processed.
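
The month-by-month suggestion could be implemented roughly like this.
The generator name and the wording of the message are assumptions, and
commit dates are taken to be Unix timestamps interpreted in UTC.

```python
from datetime import datetime, timezone


def monthly_progress(commit_times):
    """Yield a `progress` command whenever the commit date enters a new month.

    `commit_times` is an iterable of Unix timestamps in stream order.
    """
    last = None
    for timestamp in commit_times:
        when = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        month = when.strftime("%Y-%m")
        if month != last:
            last = month
            yield ("progress imported up to %s\n" % month).encode("ascii")
```

A frontend would write each yielded line to the stream just before
the first commit of that month.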
 1350 
 1351 
 1352 PACKFILE OPTIMIZATION
 1353 ---------------------
 1354 When packing a blob fast-import always attempts to deltify against the last
 1355 blob written.  Unless specifically arranged for by the frontend,
 1356 this will probably not be a prior version of the same file, so the
 1357 generated delta will not be the smallest possible.  The resulting
 1358 packfile will be compressed, but will not be optimal.
 1359 
 1360 Frontends which have efficient access to all revisions of a
 1361 single file (for example reading an RCS/CVS ,v file) can choose
 1362 to supply all revisions of that file as a sequence of consecutive
 1363 `blob` commands.  This allows fast-import to deltify the different file
 1364 revisions against each other, saving space in the final packfile.
 1365 Marks can be used to later identify individual file revisions during
 1366 a sequence of `commit` commands.
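
One way to arrange this is sketched below: all revisions are sorted
so that the revisions of each file are adjacent, and every blob gets
a mark that later `commit` commands can use.  The function name, the
tuple layout and the choice to emit older revisions first are
assumptions made for the example.

```python
def blobs_grouped_by_file(revisions):
    """Emit `blob` commands with all revisions of a file kept together.

    `revisions` is an iterable of (path, revision_number, content) in any
    order.  Returns (stream, marks), where marks maps
    (path, revision_number) to the assigned mark number.
    """
    marks = {}
    out = []
    for path, rev, content in sorted(revisions, key=lambda r: (r[0], r[1])):
        idnum = len(marks) + 1
        marks[(path, rev)] = idnum
        out.append(b"blob\nmark :%d\ndata %d\n" % (idnum, len(content))
                   + content + b"\n")
    return b"".join(out), marks
```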
 1367 
 1368 The packfile(s) created by fast-import do not encourage good disk access
 1369 patterns.  This is caused by fast-import writing the data in the order
 1370 it is received on standard input, while Git typically organizes
 1371 data within packfiles to make the most recent (current tip) data
 1372 appear before historical data.  Git also clusters commits together,
 1373 speeding up revision traversal through better cache locality.
 1374 
 1375 For this reason it is strongly recommended that users repack the
 1376 repository with `git repack -a -d` after fast-import completes, allowing
 1377 Git to reorganize the packfiles for faster data access.  If blob
 1378 deltas are suboptimal (see above) then also adding the `-f` option
 1379 to force recomputation of all deltas can significantly reduce the
 1380 final packfile size (30-50% smaller can be quite typical).
 1381 
 1382 
 1383 MEMORY UTILIZATION
 1384 ------------------
 1385 There are a number of factors which affect how much memory fast-import
 1386 requires to perform an import.  Like critical sections of core
 1387 Git, fast-import uses its own memory allocators to amortize any overheads
 1388 associated with malloc.  In practice fast-import tends to amortize any
 1389 malloc overheads to 0, due to its use of large block allocations.
 1390 
 1391 per object
 1392 ~~~~~~~~~~
 1393 fast-import maintains an in-memory structure for every object written in
 1394 this execution.  On a 32 bit system the structure is 32 bytes,
 1395 on a 64 bit system the structure is 40 bytes (due to the larger
 1396 pointer sizes).  Objects in the table are not deallocated until
 1397 fast-import terminates.  Importing 2 million objects on a 32 bit system
 1398 will require approximately 64 MiB of memory.
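
The per-object cost can be estimated with simple arithmetic, shown
here as a small sketch using the structure sizes quoted above (the
function name is illustrative).  Two million objects at 32 bytes each
come to 64,000,000 bytes, or roughly 61 MiB when counted in units of
1,048,576 bytes.

```python
MIB = 1024 * 1024


def object_table_bytes(objects, pointer_bits=32):
    """Estimate the memory used by the object table, in bytes."""
    per_object = {32: 32, 64: 40}[pointer_bits]
    return objects * per_object


on_32_bit = object_table_bytes(2_000_000)       # 64,000,000 bytes
on_64_bit = object_table_bytes(2_000_000, 64)   # 80,000,000 bytes
```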
 1399 
 1400 The object table is actually a hashtable keyed on the object name
 1401 (the unique SHA-1).  This storage configuration allows fast-import to reuse
 1402 an existing or already written object and avoid writing duplicates
 1403 to the output packfile.  Duplicate blobs are surprisingly common
 1404 in an import, typically due to branch merges in the source.
 1405 
 1406 per mark
 1407 ~~~~~~~~
 1408 Marks are stored in a sparse array, using 1 pointer (4 bytes or 8
 1409 bytes, depending on pointer size) per mark.  Although the array
 1410 is sparse, frontends are still strongly encouraged to use marks
 1411 between 1 and n, where n is the total number of marks required for
 1412 this import.
 1413 
 1414 per branch
 1415 ~~~~~~~~~~
 1416 Branches are classified as active and inactive.  The memory usage
 1417 of the two classes is significantly different.
 1418 
 1419 Inactive branches are stored in a structure which uses 96 or 120
 1420 bytes (32 bit or 64 bit systems, respectively), plus the length of
 1421 the branch name (typically under 200 bytes), per branch.  fast-import will
 1422 easily handle as many as 10,000 inactive branches in under 2 MiB
 1423 of memory.
 1424 
 1425 Active branches have the same overhead as inactive branches, but
 1426 also contain copies of every tree that has been recently modified on
 1427 that branch.  If subtree `include` has not been modified since the
 1428 branch became active, its contents will not be loaded into memory,
 1429 but if subtree `src` has been modified by a commit since the branch
 1430 became active, then its contents will be loaded in memory.
 1431 
 1432 As active branches store metadata about the files contained on that
 1433 branch, their in-memory storage size can grow to a considerable size
 1434 (see below).
 1435 
 1436 fast-import automatically moves active branches to inactive status based on
 1437 a simple least-recently-used algorithm.  The LRU chain is updated on
 1438 each `commit` command.  The maximum number of active branches can be
 1439 increased or decreased on the command line with --active-branches=.
 1440 
 1441 per active tree
 1442 ~~~~~~~~~~~~~~~
 1443 Trees (aka directories) use just 12 bytes of memory on top of the
memory required for their entries (see ``per active file entry'' below).
 1445 The cost of a tree is virtually 0, as its overhead amortizes out
 1446 over the individual file entries.
 1447 
 1448 per active file entry
 1449 ~~~~~~~~~~~~~~~~~~~~~
 1450 Files (and pointers to subtrees) within active trees require 52 or 64
 1451 bytes (32/64 bit platforms) per entry.  To conserve space, file and
 1452 tree names are pooled in a common string table, allowing the filename
 1453 ``Makefile'' to use just 16 bytes (after including the string header
 1454 overhead) no matter how many times it occurs within the project.
 1455 
 1456 The active branch LRU, when coupled with the filename string pool
 1457 and lazy loading of subtrees, allows fast-import to efficiently import
 1458 projects with 2,000+ branches and 45,114+ files in a very limited
 1459 memory footprint (less than 2.7 MiB per active branch).
 1460 
 1461 SIGNALS
 1462 -------
 1463 Sending *SIGUSR1* to the 'git fast-import' process ends the current
 1464 packfile early, simulating a `checkpoint` command.  The impatient
 1465 operator can use this facility to peek at the objects and refs from an
 1466 import in progress, at the cost of some added running time and worse
 1467 compression.
 1468 
 1469 SEE ALSO
 1470 --------
 1471 linkgit:git-fast-export[1]
 1472 
 1473 GIT
 1474 ---
 1475 Part of the linkgit:git[1] suite