The examples below use the standard dvdrecord program, but other tools exist and are similar. mrbill has a GUI front end to dvdrecord called 'xcdroast', but its use will not be covered here.
At the end of this page is a script that can be used to determine the total size of all files in the current directory and its sub-directories. The script tells the user at which file the 4.3GB limit is reached. For those who are dealing with multiple directories and lots of data, it could be handy. The script can be found on both sand and mrbill, but you can also cut & paste the text into your own gawk or awk file. See the section called 'Script for Counting File Size' for its use.
A quick note on file size reporting:
The standard size quoted for a DVD is ~4.7GB, where a GigaByte is defined as 1000^3 bytes. In contrast, a DVD's storage is 4.3GB where a GigaByte is 1024^3 bytes. This can cause some confusion. Below are examples of 'ls' and the results given. Note that the standard long listing reports the file size in bytes, which matches the 1000^3 format, and that 'ls' offers two 'human readable' formats, one for each definition.
% ls -l TEST.iso
-rw-rw-r-- 1 wiyn_ccd pppusers 4499963904 Nov 19 10:09 TEST.iso #1000^3 Format
% ls -lH TEST.iso
-rw-rw-r-- 1 wiyn_ccd pppusers 4.5G Nov 19 10:09 TEST.iso #1000^3 Format
% ls -lh TEST.iso
-rw-rw-r-- 1 wiyn_ccd pppusers 4.2G Nov 19 10:09 TEST.iso #1024^3 Format
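The arithmetic behind the two formats can be checked with a short sketch. The helper name to_gb_units is hypothetical and not part of the original page; the byte count is the TEST.iso size from the listing.

```shell
#!/bin/sh
# Convert a byte count into both "GB" conventions.
# to_gb_units is a hypothetical helper name used only for this illustration.
to_gb_units() {
    awk -v bytes="$1" 'BEGIN {
        printf "%.2f GB (1000^3)\n", bytes / 1000^3
        printf "%.2f GB (1024^3)\n", bytes / 1024^3
    }'
}

# Size of TEST.iso from the listing
to_gb_units 4499963904
```

For TEST.iso this prints 4.50 and 4.19, which 'ls' shows as 4.5G and 4.2G once rounded to one decimal place.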
(b) Run the "mkisofs" command to create the ISO file, iso9660 format:
where:
-J Generate Joliet directory records in addition to the iso9660 filenames
-l Use full 31 character filenames
-R Generate RockRidge protocol records
-V Use a volume identification, 'my data files' as example. Use single quotes if white spaces are to be used.
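As a rough sketch of how the options listed above might fit together on one command line: the source directory ./data and the '-o' output option are assumptions added for illustration, and TEST.iso is borrowed from the 'ls' example. The sketch only prints the command so that it can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: src_dir, iso_file and label are hypothetical example values.
src_dir="./data"
iso_file="TEST.iso"
label='my data files'

# Print the mkisofs command line instead of running it.
# -o (output file) is an assumption; it is not among the options described in the text.
mkisofs_cmd() {
    printf "mkisofs -J -l -R -V '%s' -o %s %s\n" "$label" "$iso_file" "$src_dir"
}

mkisofs_cmd
```

Once the printed line looks right for your own directory and label, it can be typed at the prompt as is.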
(b) Burn the ISO file to the DVD (about 20min for a full disk):
where:
(b) List the files, read some, or copy them all to /dev/null or anywhere else to verify the saved files. You may also want to load the disk in a laptop or other machine to verify.
(c) Unmount the disk. You must not be in the /mnt/dvdrom directory when issuing the 'umount' command; otherwise the system will complain that the device is busy and will not unmount it.
(d) Since the disks are so cheap, it probably makes sense to make a second copy by repeating step 2.
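The read-back check in the verification step could be sketched as follows. The helper name verify_readable is hypothetical; /mnt/dvdrom is the mount point named in the text. It reads every regular file to /dev/null and counts the ones that cannot be read.

```shell
#!/bin/sh
# Read every regular file under a directory and count the ones that fail.
# verify_readable is a hypothetical helper written for this illustration.
verify_readable() {
    dir="$1"
    # find prints one path per line; this simple loop assumes no newlines in file names.
    find "$dir" -type f | {
        failed=0
        while IFS= read -r f; do
            if ! cat "$f" > /dev/null 2>&1; then
                echo "unreadable: $f"
                failed=$((failed + 1))
            fi
        done
        echo "failed=$failed"
        # Exit status of the function: success only when nothing failed.
        [ "$failed" -eq 0 ]
    }
}

# Example call (assumes the disk is mounted):
# verify_readable /mnt/dvdrom
```

This confirms only that the files can be read from the disk, not that their contents match the originals.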
The output of that command for tan is:
2,0,0 200) 'ASUS ' 'CRW-4012A ' '1.0 ' Removable CD-ROM
2,1,0 201) 'PIONEER ' 'DVD-RW DVR-106D' '1.06' Removable CD-ROM
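In the listing for tan, the first column of each line looks like the address of a drive, and the DVD writer is the entry whose description mentions DVD. Under that assumption (the command that produced the listing is not shown here), the address can be picked out with a small sketch. The helper name find_dvd_address is hypothetical.

```shell
#!/bin/sh
# Pick the first column of the first line that mentions DVD.
# find_dvd_address is a hypothetical helper; it reads the listing on stdin.
find_dvd_address() {
    awk '/DVD/ { print $1; exit }'
}

find_dvd_address <<'EOF'
2,0,0 200) 'ASUS ' 'CRW-4012A ' '1.0 ' Removable CD-ROM
2,1,0 201) 'PIONEER ' 'DVD-RW DVR-106D' '1.06' Removable CD-ROM
EOF
```

For the listing shown, this selects the PIONEER entry.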
Located on sand and mrbill is a gawk script that will recursively march through a directory and all of its sub-directories, counting the file size and reporting when and where the DVD-R file limit of 4.3GB (1024^3) is reached. Most users will have no sub-directories, but some may. The script is intended to be a simple tool to help those who are burning DVDs identify when the limit is reached.
Name: dvdfiles
Synopsis: ls -Rl | dvdfiles
or dvdfiles < [filename], where [filename] holds the saved output of 'ls -Rl'
Description:
dvdfiles is a gawk script written with the assumption that the input is the result of running the linux command 'ls' with the -Rl arguments (recursively search all subdirectories and format using the long listing, all detail). You can choose to work with only the current directory by leaving out the 'R', or recursive, argument.
The script reports each directory that it is working in and its size. It then begins summing file sizes only, ignoring directory size. When the limit of 4.3GB is reached, a report is generated to indicate the directory and filename at which the limit was reached. It then resets its counters and begins at the next file, continuing to march on. At the end, the size of the final partial set and the cumulative size of the completed 4.3GB sets are reported.
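The counting rule described above can be illustrated on a small scale. This is a simplified sketch, not the dvdfiles script itself: the helper name split_by_limit, the 'size name' input format, and the limit of 100 bytes are all hypothetical, chosen so the behavior is easy to follow by hand.

```shell
#!/bin/sh
# Greedy grouping of file sizes into sets that fit under a byte limit.
# split_by_limit and its input format ("size name" per line) are hypothetical;
# the real script parses 'ls -Rl' output and uses a fixed 4.3GB limit.
split_by_limit() {
    awk -v limit="$1" '
        {
            if (sum + $1 <= limit) {
                # File still fits: add it to the current set.
                sum += $1; n++
            } else {
                # File would exceed the limit: report the set and start a new one with this file.
                printf "set of %d files, %d bytes, next set starts at %s\n", n, sum, $2
                sum = $1; n = 1
            }
        }
        END { printf "last set of %d files, %d bytes\n", n, sum }
    '
}

printf '%s\n' '40 a.fits' '50 b.fits' '30 c.fits' '60 d.fits' '20 e.fits' | split_by_limit 100
```

With these five files the first two sets hold 90 bytes each and the last set holds the remaining 20 bytes. A file is never split across sets; the file that would exceed the limit simply opens the next set.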
Below is an example output for a directory with image data:
sand:/data2/wttm/% ls -Rl | dvdfiles
0.000MB in .. current directory #Starts in the current directory and begins to march recursively
0.079MB in ./01Jun03:
0.602MB in ./02Dec03:
0.158MB in ./02Jun03:
0.494MB in ./03Dec03:
0.020MB in ./03Jun03:
0.049MB in ./09Jul03:
0.652MB in ./09Jun03:
0.326MB in ./10Jul03:
0.573MB in ./10Jun03:
0.415MB in ./11Jun03:
0.444MB in ./14Apr03:
0.194MB in ./14Feb03:
0.115MB in ./14Feb03/junk:
0.049MB in ./14Feb03/Raw:
1.017MB in ./15Apr03: # 1.017GB of files are in sub-directory 15Apr03 (the script labels these per-directory sizes MB, but the values are in GB).
Limit of 4.300GB Reached
Total:4.300GB File#:431 at file:n3014.fits #Sum has reached the 4.3GB limit on the 431st file, sub-directory 15Apr03, file n3014.fits
#Not all of sub-directory 15Apr03 can be burned to DVD with the previous directories. You could just burn directories up to 15Apr03, or be really efficient and burn up to file n3014.fits, but no more for a single DVD.
Reseting counters to begin with File ./15Apr03: n3015.fits
1.796MB in ./15May03:
1.737MB in ./16May03:
Limit of 4.300GB Reached
Total:4.298GB File#:258 at file:obj077.fits
Reseting counters to begin with File ./16May03: test.fits
0.948MB in ./17May03:
0.276MB in ./18Feb03:
0.079MB in ./19mar03:
1.728MB in ./19Sep02:
0.197MB in ./20Mar03:
0.987MB in ./21Apr03:
Limit of 4.300GB Reached
Total:4.298GB File#:390 at file:d1099.fits
Reseting counters to begin with File ./21Apr03: d1100.fits
0.652MB in ./21Sept02:
0.010MB in ./Database:
0.691MB in ./M67:
TOTAL for Files: 1.370GB File#:173
DONE

TOTAL for ALL Files: 12.895 GigaBytes
DONE
#!/bin/gawk -f
#The following script is a simple tool to help recursively
#search a directory structure, sum up the file sizes, and
#provide feedback when the sum of the file sizes reaches
#the 4.3GB limit of a DVD-R disk (1024^3 bytes per GB,
#as compared to 4.7GB at 1000^3 bytes per GB). File size is
#counted up to the 4.3GB limit to maximize DVD
#storage.
#The script MUST BE FED by the linux command 'ls -Rl'
#in order to count all subdirectories. If you
#wanted to work only within the current directory, you
#could invoke only 'ls -l'.
#Examples of invoking the script:
# ls -Rl | dvdfiles
#or
# dvdfiles < filename where filename
#holds the saved output of 'ls -Rl'.
#gawk and this script then assume that the file
#size is reported in column 5, in decimal bytes,
#and the file name in column 9. Files are assumed
#to have the following designation in column 1
#'-rw-rw-r--' i.e., the leading character is a '-'
#and not a directory flag 'd'. If the number of
#columns, or fields, is one, then it is assumed
#to be the directory name. If the number of fields
#is 2 then it is assumed to be the total content
#of the directory.
BEGIN { FS=" " #white-space as field separator, init variables.
    SUM=0 #bytes in all completed 4.3GB sets
    sum=0 #bytes in the current set
    FileNum=0
    DirName=".. current directory"
    SUMFLAG=0
    PARTIALFLAG=0
    printf "\n\n"}
#A one-field line such as './15Apr03:' names the directory that follows.
NF==1 {DirName=$1
    if(DirName == ".:")
        DirName=".. current directory"}
#'total N' line: GNU ls reports N in 1K blocks by default, so N/1024^2 is in GB although the label says MB.
NF==2 && /^t/ {printf "%5.3fMB in %s\n", $2/1024^2,DirName }
#Regular file: add it to the current set, or close the set if it would exceed the limit.
NF==9 && /^-/ {PARTIALFLAG=0
    if( (sum+$5)<=4.3*1024^3){
        sum+=$5
        file=$9
        ++FileNum
    }else{
        SUMFLAG=1
        PARTIALFLAG=1
        SUM+=sum
        printf "\tLimit of 4.300GB Reached \n"
        printf "\tTotal:%.3fGB File#:%d at file:%s\n",sum/1024^3,FileNum,file
        printf "\n\n"
        sum=$5 #to keep up with current record
        FileNum=1
        file=$9
        printf "Reseting counters to begin with File %s %s\n",DirName,file
    }}
END { if(PARTIALFLAG==0){
        printf "\tTOTAL for Files: %.3fGB File#:%d \n",sum/1024^3,FileNum
        printf "\tDONE\n"}
    #SUM counts only the completed sets; the final partial set reported above is not included.
    if(SUMFLAG==1) {
        printf "\nTOTAL for ALL Files: %.3f GigaBytes\n", SUM/1024^3
        printf "DONE\n"}
}
Charles Corson
December 2004