Files and Streams Part 2 - Managing Files & Directories

Published on 05 May 2020

The previous part was a brief overview with links to the other parts. In this part I provide an example of basic file and directory management within a console application: setting up a desired directory structure and then moving or removing files depending on the required behaviour. This is made possible by three System.IO classes: Path, Directory and File.

.NET Methods

ready cat

Let's do this!

Setup


To start, I created a .NET Framework console app project (.NET Framework 4.8). It is a command line application that accepts one of two options, "-f" or "-d". Entering "-f" followed by a file path runs the single file process; entering "-d" followed by a directory path and a file type starts the multi file process (file and directory).

private static void Main(string[] args)
{
	var command = args[0];

	switch (command)
	{
		case "-f":
			RunFileCommand(args[1]);
			break;
		case "-d":
			RunDirectoryCommand(args[1], args[2]);
			break;
		default:
			WriteLine("Invalid command line options");
			break;
	}

	WriteLine("Press enter key to quit.");
	ReadLine();
}
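Main indexes straight into args, so running the app with no arguments (or with "-d" and only a path) would throw an IndexOutOfRangeException. A minimal sketch of a guard is shown here; the ArgumentGuardSketch class, the Validate helper and the usage messages are hypothetical names of my own and are not part of the original project.

```csharp
using System;

internal static class ArgumentGuardSketch
{
    // Hypothetical helper: returns null when the arguments are usable,
    // otherwise a message describing what is missing.
    public static string Validate(string[] args)
    {
        if (args == null || args.Length == 0)
        {
            return "Usage: -f <filePath> | -d <directoryPath> <fileType>";
        }

        switch (args[0])
        {
            case "-f":
                return args.Length >= 2 ? null : "-f requires a file path.";
            case "-d":
                return args.Length >= 3 ? null : "-d requires a directory path and a file type.";
            default:
                return "Invalid command line options";
        }
    }

    private static void Main(string[] args)
    {
        // Print the problem, or confirm the arguments can be dispatched.
        var error = Validate(args);
        Console.WriteLine(error ?? "Arguments look valid.");
    }
}
```

Calling a check like this before the switch would let the app print a usage message instead of crashing.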

The single file process will output a message and begin the file processing for a single item path.

private static void RunFileCommand(string path)
{
	WriteLine($"Single file {path} selected");
	var fileProcessor = new FileProcessor(path);
	fileProcessor.Process();
}

The multi file process will output relevant messages, iterate through all the text files and perform the file process on each of them. This is the first time a System.IO method is used, here from the Directory class: the GetFiles method returns a string array of the files in the provided path that match the search pattern.

private static void RunDirectoryCommand(string path, string type)
{
	WriteLine($"Directory {path} selected for {type} files");
	switch (type)
	{
		case "txt":
			var textFiles = Directory.GetFiles(path, "*.txt");
			foreach (var textFilePath in textFiles)
			{
				var fileProcessor = new FileProcessor(textFilePath);
				fileProcessor.Process();
			}

			break;
		default:
			WriteLine($"Error: {type} is not a supported file type.");
			return;
	}
}
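To see what GetFiles hands back, here is a small self-contained sketch that runs against a throwaway folder under the temp path. The GetFilesSketch class, the ListTextFiles helper and the file names are made up for illustration.

```csharp
using System;
using System.IO;

internal static class GetFilesSketch
{
    // Hypothetical helper: returns the full paths of the .txt files that sit
    // directly inside the directory (sub-directories are not searched).
    public static string[] ListTextFiles(string directory)
    {
        return Directory.GetFiles(directory, "*.txt");
    }

    private static void Main()
    {
        var dir = Path.Combine(Path.GetTempPath(), "getfiles-sketch-" + Guid.NewGuid());
        Directory.CreateDirectory(dir);

        try
        {
            File.WriteAllText(Path.Combine(dir, "a.txt"), "one");
            File.WriteAllText(Path.Combine(dir, "b.log"), "two");

            // Only a.txt matches the pattern; b.log is skipped.
            foreach (var file in ListTextFiles(dir))
            {
                Console.WriteLine(file);
            }
        }
        finally
        {
            // Clean up the throwaway folder.
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

Each returned entry includes the directory it was found in, which is why the result can be passed straight to the FileProcessor constructor.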

The FileProcessor is a simple class that requires a path to be passed into it; the parameterless constructor throws an exception. In the constructor it determines the file name and parent directory from this path using the GetFileName and GetDirectoryName methods of the Path class.
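As a quick illustration of what those two Path methods return, here is a minimal sketch. The PathPartsSketch class and the data/inbox/report.txt path are hypothetical; the path is built with Combine so the separator matches whichever platform runs it.

```csharp
using System;
using System.IO;

internal static class PathPartsSketch
{
    private static void Main()
    {
        // Hypothetical path used purely for illustration.
        var filePath = Path.Combine("data", "inbox", "report.txt");

        // The last segment of the path, extension included.
        Console.WriteLine(Path.GetFileName(filePath));

        // Everything before the last segment.
        Console.WriteLine(Path.GetDirectoryName(filePath));

        // A bare file name carries no directory information, so the result is empty.
        Console.WriteLine(Path.GetDirectoryName("report.txt") == string.Empty);
    }
}
```

One consequence worth keeping in mind: if the app is given a bare file name, the root path is empty, so any folders built from it would be resolved relative to the current working directory.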

Once this class has been instantiated, the Process method can be run. The File class is used to check whether the file actually exists using the Exists method (quite self-explanatory really). Currently this method just outputs messages to the console and checks for the existence of the file.

internal class FileProcessor
{
	private const string OriginalDir = "original";
	private const string ProcessDir = "processing";
	private const string FinalDir = "processed";

	private readonly string _filePath = string.Empty;
	private readonly string _rootPath = string.Empty;
	private readonly string _fileName = string.Empty; 

	public FileProcessor()
	{
		throw new ArgumentException("FileProcessor must be instantiated with path parameter.");
	}

	public FileProcessor(string path)
	{
		_filePath = path;
		_fileName = Path.GetFileName(_filePath);
		_rootPath = Path.GetDirectoryName(_filePath);
	}

	public void Process()
	{
		WriteLine($"Begin processing of {_filePath}");

		if (!File.Exists(_filePath))
		{
			WriteLine($"Error: {_filePath} does not exist.");
			return;
		}
	}
}

Backup Original


The first stage of this process needs to take the referenced file and back it up to a folder. In the Process method, it uses a new method called BackupOriginal.

public void Process()
{
	WriteLine($"Begin processing of {_filePath}");

	if (!File.Exists(_filePath))
	{
		WriteLine($"Error: {_filePath} does not exist.");
		return;
	}

	BackupOriginal();
}

The BackupOriginal method utilises the Path, Directory and File System.IO classes.

The Combine method takes a base path as its first argument (it can be relative or absolute) and appends any subsequent string values to it to produce a new path. If a later argument is itself a rooted path, the earlier arguments are discarded. This method has overloads that will combine two, three or four strings, or a single string array.
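A short sketch of how Combine behaves, using made-up segment names. The CombineSketch class is hypothetical, and the rooted segment is built from Path.DirectorySeparatorChar so the example behaves the same way on Windows and elsewhere.

```csharp
using System;
using System.IO;

internal static class CombineSketch
{
    private static void Main()
    {
        // A relative first segment is fine; Combine only joins the strings.
        Console.WriteLine(Path.Combine("data", "original", "report.txt"));

        // A rooted later segment discards everything that came before it.
        var rooted = Path.DirectorySeparatorChar + "other";
        Console.WriteLine(Path.Combine("data", rooted));

        // The string array overload accepts any number of segments.
        Console.WriteLine(Path.Combine(new[] { "a", "b", "c", "d", "e" }));
    }
}
```

Combine never touches the disk; it does not check whether the resulting path exists.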

The CreateDirectory method takes a path and creates it, including any missing parent directories, if it does not already exist. If the path already exists it does not throw an error; it just returns the existing directory as a DirectoryInfo object. This ensures there is a directory to work with.
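The "safe to call twice" behaviour can be checked with a small sketch like this one. The CreateDirectorySketch class, the EnsureTwice helper and the folder names are hypothetical, and everything happens inside a throwaway folder under the temp path.

```csharp
using System;
using System.IO;

internal static class CreateDirectorySketch
{
    // Hypothetical helper: creates the same path twice and reports whether
    // both calls resolved to the same existing directory.
    public static bool EnsureTwice(string path)
    {
        var first = Directory.CreateDirectory(path);  // creates any missing parents too
        var second = Directory.CreateDirectory(path); // already exists: no exception

        return first.FullName == second.FullName && Directory.Exists(path);
    }

    private static void Main()
    {
        var root = Path.Combine(Path.GetTempPath(), "create-dir-sketch-" + Guid.NewGuid());
        var nested = Path.Combine(root, "original", "2020");

        try
        {
            Console.WriteLine(EnsureTwice(nested));
        }
        finally
        {
            // Clean up the throwaway folder.
            Directory.Delete(root, recursive: true);
        }
    }
}
```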

The Copy method takes a source path, a destination path and a boolean that decides whether an existing file at the destination can be overwritten during the copy.
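To make the effect of the overwrite flag concrete, here is a minimal sketch. The CopyOverwriteSketch class, the TryCopy helper and the file names are hypothetical; the point is that Copy throws an IOException when the destination exists and overwrite is false.

```csharp
using System;
using System.IO;

internal static class CopyOverwriteSketch
{
    // Hypothetical helper: returns true when the copy succeeded and false
    // when it was refused because the destination exists and overwrite is off.
    public static bool TryCopy(string source, string destination, bool overwrite)
    {
        try
        {
            File.Copy(source, destination, overwrite);
            return true;
        }
        catch (IOException) when (!overwrite && File.Exists(source) && File.Exists(destination))
        {
            return false;
        }
    }

    private static void Main()
    {
        var dir = Path.Combine(Path.GetTempPath(), "copy-sketch-" + Guid.NewGuid());
        Directory.CreateDirectory(dir);

        try
        {
            var source = Path.Combine(dir, "source.txt");
            var destination = Path.Combine(dir, "destination.txt");
            File.WriteAllText(source, "new");
            File.WriteAllText(destination, "old");

            // Refused: the destination keeps its old content.
            Console.WriteLine(TryCopy(source, destination, overwrite: false));
            Console.WriteLine(File.ReadAllText(destination));

            // Allowed: the destination now holds the source content.
            Console.WriteLine(TryCopy(source, destination, overwrite: true));
            Console.WriteLine(File.ReadAllText(destination));
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

Passing true, as the backup step in this post does, means a rerun simply refreshes the backup copy.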

This method will make sure the directory exists to back up the original item to and then copy the item to this declared directory. In the class, the name of this folder is declared as "original".

private void BackupOriginal()
{
	var originalDir = Path.Combine(_rootPath, OriginalDir);

	Directory.CreateDirectory(originalDir);

	var tempFilePath = Path.Combine(originalDir, _fileName);

	WriteLine($"Copying {_filePath} to {tempFilePath}");

	File.Copy(sourceFileName:_filePath, destFileName:tempFilePath, overwrite:true);
}

Move to Processing


The second stage of this process will take the referenced file and move it to the processing folder. The Process method has been modified to use the new MoveToProcessing method. This will return the location of the file to be processed in the new location.

public void Process()
{
	WriteLine($"Begin processing of {_filePath}");

	if (!File.Exists(_filePath))
	{
		WriteLine($"Error: {_filePath} does not exist.");
		return;
	}

	BackupOriginal();

	var processingFilePath = MoveToProcessing();

	if (string.IsNullOrEmpty(processingFilePath))
	{
		return;
	}
}

The MoveToProcessing method creates the processing directory if it does not already exist, then checks whether the item already exists there. If it does, the method reports an error and returns an empty string; otherwise the file is moved to the processing folder using the Move method, a simple operation that takes the current path of an item and its desired path.

private string MoveToProcessing()
{
	Directory.CreateDirectory(Path.Combine(_rootPath, ProcessDir));

	var processingFilePath = Path.Combine(_rootPath, ProcessDir, _fileName);

	if (File.Exists(processingFilePath))
	{
		WriteLine($"Error: {processingFilePath} is already being processed.");
		return string.Empty;
	}

	WriteLine($"Moving {_filePath} to {processingFilePath}");

	// File.Move has no overwrite option here; it throws an IOException if the destination already exists
	File.Move(_filePath, processingFilePath);

	return processingFilePath;
}
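The method above treats an existing file in the processing folder as "already being processed" and backs off. If replacing the file were the desired behaviour instead, one possible approach on .NET Framework is to delete the destination before moving. This is only a sketch: the MoveReplaceSketch class and the MoveReplacing helper are hypothetical, and the delete followed by the move is not atomic. Newer .NET versions add a Move overload with an overwrite flag, which would be the simpler choice there.

```csharp
using System;
using System.IO;

internal static class MoveReplaceSketch
{
    // Hypothetical helper: moves source to destination, replacing any file
    // that is already there. Not atomic: the delete and the move are two steps.
    public static void MoveReplacing(string source, string destination)
    {
        if (File.Exists(destination))
        {
            File.Delete(destination);
        }

        File.Move(source, destination);
    }

    private static void Main()
    {
        var dir = Path.Combine(Path.GetTempPath(), "move-sketch-" + Guid.NewGuid());
        Directory.CreateDirectory(dir);

        try
        {
            var source = Path.Combine(dir, "incoming.txt");
            var destination = Path.Combine(dir, "processing.txt");
            File.WriteAllText(source, "new");
            File.WriteAllText(destination, "old");

            MoveReplacing(source, destination);

            Console.WriteLine(File.Exists(source));           // the source is gone after the move
            Console.WriteLine(File.ReadAllText(destination)); // the destination holds the moved content
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```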

Process File & Remove Processing


Now that the file has been backed up and moved to the correct location, it can be processed. Below, the ProcessFile and RemoveProcessing methods have been added to the Process method.

public void Process()
{
	WriteLine($"Begin processing of {_filePath}");

	if (!File.Exists(_filePath))
	{
		WriteLine($"Error: {_filePath} does not exist.");
		return;
	}

	BackupOriginal();

	var processingFilePath = MoveToProcessing();

	if (string.IsNullOrEmpty(processingFilePath))
	{
		return;
	}

	ProcessFile(processingFilePath);

	RemoveProcessing(processingFilePath);
}

The ProcessFile method only requires the processing file path. It first checks the extension of the file using the GetExtension method; the return value includes the leading dot, as the switch statement shows. Currently there is only txt specific behaviour: the ProcessTextFile method does nothing but report that the file is being processed. Any modifications to the file can be added another time; this post is about managing files, and editing them will be handled elsewhere.

Once the file has finished being processed, the method creates the final directory, ready to place the processed file inside. A new file name is computed using the GetFileNameWithoutExtension method alongside Guid.NewGuid. The result is a new unique file name (just in case a file needs to be processed multiple times). The ChangeExtension method is another option if preferred (maybe a .done extension could work).
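The two naming options can be compared side by side with a small sketch. The FinishedNameSketch class, both helper names and the sample path are hypothetical and only illustrate the Path calls involved.

```csharp
using System;
using System.IO;

internal static class FinishedNameSketch
{
    // Option 1: keep the extension and append a GUID to the name.
    public static string WithGuid(string filePath)
    {
        var name = Path.GetFileNameWithoutExtension(filePath);
        var extension = Path.GetExtension(filePath); // includes the leading dot
        return $"{name}-{Guid.NewGuid()}{extension}";
    }

    // Option 2: keep the name and swap the extension for ".done".
    public static string WithDoneExtension(string filePath)
    {
        return Path.ChangeExtension(Path.GetFileName(filePath), ".done");
    }

    private static void Main()
    {
        // Hypothetical path used purely for illustration.
        var filePath = Path.Combine("data", "report.txt");

        Console.WriteLine(WithGuid(filePath));          // report-<guid>.txt
        Console.WriteLine(WithDoneExtension(filePath)); // report.done
    }
}
```

The GUID variant stays unique across repeated runs, whereas the ".done" variant produces the same name every time, so a second run of the same file would collide when it is moved.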

Whichever naming method is used, the processed file is then moved to the "processed" directory.

private void ProcessFile(string processingFilePath)
{
	var extension = Path.GetExtension(_filePath);

	switch (extension)
	{
		case ".txt":
			ProcessTextFile(processingFilePath);
			break;
		default:
			WriteLine($"{extension} is not a supported file type.");
			break;
	}

	var finishedDir = Path.Combine(_rootPath, FinalDir);
	Directory.CreateDirectory(finishedDir);

	WriteLine($"Moving {processingFilePath} to {finishedDir}");

	var finishedFileName = $"{Path.GetFileNameWithoutExtension(_filePath)}-{Guid.NewGuid()}{extension}";
	var finishedPath = Path.Combine(finishedDir, finishedFileName);
	File.Move(processingFilePath, finishedPath);
}

private void ProcessTextFile(string processingPath)
{
	// Insert Processing Logic
	WriteLine($"Processing {processingPath}");
}

Once the processed file has been moved, there is no need for the "processing" directory. Calling the Directory.Delete method with recursive set to true removes everything inside the directory and then the directory itself.

private void RemoveProcessing(string processingFilePath)
{
	var processingPath = Path.GetDirectoryName(processingFilePath);
	if (!string.IsNullOrEmpty(processingPath))
	{
		Directory.Delete(processingPath, recursive:true);
	}
}
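The recursive flag matters because Delete refuses to remove a directory that still has content unless it is told to. A minimal sketch of the difference follows; the DeleteDirectorySketch class, the TryDelete helper and the leftover file are hypothetical.

```csharp
using System;
using System.IO;

internal static class DeleteDirectorySketch
{
    // Hypothetical helper: returns true when the directory was deleted and
    // false when Delete refused (for example because it is not empty).
    public static bool TryDelete(string directory, bool recursive)
    {
        try
        {
            Directory.Delete(directory, recursive);
            return true;
        }
        catch (IOException)
        {
            return false;
        }
    }

    private static void Main()
    {
        var dir = Path.Combine(Path.GetTempPath(), "delete-sketch-" + Guid.NewGuid());
        Directory.CreateDirectory(dir);
        File.WriteAllText(Path.Combine(dir, "leftover.txt"), "still here");

        // Refused: the directory still contains a file.
        Console.WriteLine(TryDelete(dir, recursive: false));

        // Allowed: the file and the directory are both removed.
        Console.WriteLine(TryDelete(dir, recursive: true));
        Console.WriteLine(Directory.Exists(dir));
    }
}
```

Because the recursive form removes whatever is in the folder, it is worth being sure nothing else is still waiting in there before calling it.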

Run the command


Before the command is run, there needs to be some data to process. Here is an example of some text files that were used. Files can be copied into the path directory to be processed.

text files

Once there is a file to process, it can be referenced in the start options in the console project settings.

command line arguments file

Starting the console app will trigger the file process. The output can be observed in the command line window.

command line file output

Inside the path directory there will now be two folders: "original", containing the backed up text file, and "processed", containing the finished file. The "processing" folder has already been removed.

folder output

As you can see, the file has been processed and renamed to include the GUID in its file name.

file output

Using the directory command exhibits different behaviour. To see it, change the command line arguments to use the -d option: the file name is removed from the path so that it points at the directory, and the file type is added as a second value. In preparation for this, I cleared the path directory and copied all of the files into it.

command line arguments directory

Running the console app now will display more information, as all of the items get processed.

command line directory output

The result of this is all of the files being processed with new file names.

folder output

Summary


The full code for this can be seen on my GitHub. In short, I have put together a very basic example of how files and directories can be managed programmatically. The two behaviours covered were managing a single file and managing a collection of files. Both are easily achievable with the System.IO namespace. The actual processing of these files will be covered in the parts to come.

In Part 3 I'll expand on this by providing an example of how a directory can be monitored, rather than the process being executed on a case by case basis.

But for now.

tired cat

I'm done.