[FORTRAN] Writing to same file from multiple processors?

In summary, the thread concerns how to avoid several processes writing to the same file at the same time in a parallel FORTRAN program. The solution adopted is to split the output into one file per process and to name each file after the process ID returned by the getpid() function. Other suggestions include routing all writes through a messaging system to a single writer, or spreading the files over multiple hard drives, although the latter raises the questions of how to map process IDs to drives and whether data is generated fast enough to matter. Semaphore locks around file operations, or disabling multi-threading, may also be needed.
  • #1
tav98f
I am working in FORTRAN
I need to avoid writing to the same file at the same time while working in a parallel environment.

I have a program (that I have not written and cannot edit) that calls a subroutine (that I have written and can edit). The subroutine is called over and over again by several processors working at the same time. I am having issues with more than one processor writing to the same file at the same time.

I need a way to avoid writing to the file at the same time.

The best way that I can think of is to break the file into several files; one per processor and name them by processor. However, I need some way to determine which processor is doing the writing.

Does anyone know how to do that?
Or does anyone have any other suggestions?

Thank you,
Timothy
 
  • #2
I think there are many ways to solve this. Off the top of my head: allocate a block of memory for flags, and have each processor set its flag while it is writing.
 
  • #3
I am still very new to FORTRAN and, by most people's standards, to programming in general. When you say "memory space", I am not sure what you mean.

Can you explain this suggestion in more detail?

If my notion of memory space is correct then I should remind you that I have no control over the program that is calling my subroutine.
 
  • #4
I think I got it. The program that calls my subroutine keeps the same process ID for each processor running the program. I think I can call the getpid() function from FORTRAN and write to a file whose name is based on the integer that getpid() returns.

Thank you for your time,
Timothy Van Rhein
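
For readers following along, the per-process file name idea could be sketched roughly as below. This is only a sketch under assumptions: it reaches the POSIX getpid() through ISO_C_BINDING rather than a compiler-specific extension, it assumes the process ID fits in a C int, and the names pid_filename_mod, current_pid, pid_filename and the "_<pid>.dat" naming pattern are made up for illustration.

```fortran
module pid_filename_mod
  use, intrinsic :: iso_c_binding, only: c_int
  implicit none
  private
  public :: current_pid, pid_filename

  ! Interface to the POSIX C function getpid().
  ! Assumption: pid_t is a C int on the target system.
  interface
    function c_getpid() bind(C, name='getpid') result(pid)
      import :: c_int
      integer(c_int) :: pid
    end function c_getpid
  end interface

contains

  ! Return the ID of the calling process as a default integer.
  function current_pid() result(pid)
    integer :: pid
    pid = int(c_getpid())
  end function current_pid

  ! Build a file name such as "out_1234.dat" from a prefix and a process ID.
  ! The naming pattern is hypothetical.
  function pid_filename(prefix, pid) result(name)
    character(len=*), intent(in) :: prefix
    integer, intent(in) :: pid
    character(len=:), allocatable :: name
    character(len=20) :: buf

    write(buf, '(i0)') pid          ! integer to text, no padding
    name = prefix // '_' // trim(buf) // '.dat'
  end function pid_filename

end module pid_filename_mod
```

A caller would then use something like pid_filename('results', current_pid()) as the file name, so that two processes with different IDs never open the same file.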
 
  • #5
I was thinking you could use some type of messaging system to send all writes to a single thread that the subroutine creates. This requires a global variable that records whether the first call has been made, so the subroutine knows when to create the thread. However, unless the main program calls a second subroutine when it has finished, there would be no way to flush the pending writes and close the file.

Even with the multiple file scheme based on getpid(), how will your program know when to close all those files, or will your program open a file, append data, and close a file on each call?

Another advantage of writing multiple files is if the program is generating data faster than a single hard drive can write. In this case, if you have multiple hard drives, you can keep the separate files on separate hard drives.
 
  • #6
The subroutine will open a file, append data, and close a file on each call. I did not think about using multiple hard drives. I will consider that.

Thanks for the reply,
Timothy Van Rhein
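
The open, append, close pattern described in this post might look roughly like the following. It is a minimal sketch, not the poster's actual code: the module and subroutine names (append_mod, append_line) and the one-line-of-text record format are assumptions made for illustration.

```fortran
module append_mod
  implicit none
  private
  public :: append_line

contains

  ! Open the file, add one line at the end, and close it again.
  ! ios is 0 on success and non-zero if the open or the write failed.
  subroutine append_line(filename, line, ios)
    character(len=*), intent(in) :: filename, line
    integer, intent(out) :: ios
    integer :: unit

    ! position='append' places new output after any existing data.
    ! status='unknown' lets the first call create the file.
    open(newunit=unit, file=filename, status='unknown', &
         position='append', action='write', iostat=ios)
    if (ios /= 0) return

    write(unit, '(a)', iostat=ios) line
    close(unit)                     ! closing on every call flushes the data
  end subroutine append_line

end module append_mod
```

Because the file is closed at the end of every call, nothing is left pending when the calling program ends, at the cost of one open and one close per call.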
 
  • #7
tav98f said:
The subroutine will open a file, append data, and close a file on each call. I did not think about using multiple hard drives.
An issue with this is how to map each process ID to a hard drive and file name when you don't know the process IDs in advance. I assume process IDs are 32-bit (or larger) values, so you can't use a huge array indexed directly by process ID to do the mapping. Instead, you can use a global array with a size equal to the maximum number of processes the program could be running at one time, then search the array for the current process ID each time the subroutine is called, adding it if it is not found. The index of the entry holding the current process ID then becomes part of the hard drive / file name. Unless there are a large number of processes, the overhead of this search-and-add scheme would be small compared to the actual write time to the hard drives. You'll also need a global index for the next available (empty) entry in the array. This global index would initially be zero, indicating an empty process ID array. The search loop would only examine indexes less than the global index, so on the first call there is no search at all. Each time you add a process ID to the array, you'd increment the global index.
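
The search-and-add scheme described above could be sketched as follows. This only illustrates the lookup logic: the names pid_table_mod, slot_for_pid, reset_table and the limit of 64 entries are hypothetical, Fortran's 1-based indexing is used (so a count of zero means an empty table), and the sketch does nothing to make the table visible to, or safe for, several processes or threads at once.

```fortran
module pid_table_mod
  implicit none
  private
  public :: slot_for_pid, reset_table, max_procs

  integer, parameter :: max_procs = 64   ! hypothetical upper limit
  integer, save :: pids(max_procs)       ! process IDs seen so far
  integer, save :: n_used = 0            ! number of filled entries

contains

  ! Return the table index for pid, adding pid if it is not yet present.
  ! Returns 0 if the table is full.
  function slot_for_pid(pid) result(slot)
    integer, intent(in) :: pid
    integer :: slot
    integer :: i

    ! Search only the filled part of the table (no search when empty).
    do i = 1, n_used
      if (pids(i) == pid) then
        slot = i
        return
      end if
    end do

    if (n_used >= max_procs) then
      slot = 0
      return
    end if

    ! Not found: store pid in the next free entry.
    n_used = n_used + 1
    pids(n_used) = pid
    slot = n_used
  end function slot_for_pid

  ! Empty the table.
  subroutine reset_table()
    n_used = 0
  end subroutine reset_table

end module pid_table_mod
```

The returned index is small and dense (1, 2, 3, ...), so it can be turned into a drive or file name directly, for example by taking it modulo the number of drives.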
 
  • #8
Hmmmm... With these issues in mind I may not write to multiple drives. I only have a maximum of three drives available: one hard drive in the tower and two externals. I could also use network drives, but I question whether that would be any faster at all. I really do not have any experience doing these things.

Thanks for the thought,
Timothy
 
  • #9
tav98f said:
With these issues in mind I may not write to multiple drives. I only have a maximum of three drives available: one hard drive in the tower and two externals. I could also use network drives, but I question whether that would be any faster at all.
That depends on the speed of the external interface or network, and the overhead involved. I don't know how fast your program generates data, but assuming it isn't capturing data in real time at a high rate from an instrumented device or some other fixed-rate device, it shouldn't be a problem. The worst case is that the program is throttled because it generates data faster than the writes can occur. If speed were an issue, you could consider a RAID setup in your tower to make use of multiple hard drives.

I assume your library functions for opening, writing, and closing files support a pre-emptive multi-processing environment. If not, you would need to disable multi-threading and/or use some type of semaphore lock during file operations, or use the messaging-to-a-single-process method I mentioned before.
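
One crude way to approximate a lock around file operations, using nothing but Fortran I/O, is a lock file. The sketch below is an illustration only, not a true semaphore: it assumes that opening a file with status='new' fails whenever the file already exists and that the runtime and file system make that creation effectively atomic, which the Fortran standard does not guarantee (network file systems in particular may not). The names lockfile_mod, try_lock and unlock are hypothetical.

```fortran
module lockfile_mod
  implicit none
  private
  public :: try_lock, unlock

contains

  ! Try to take the lock by creating lock_name.
  ! Returns .true. if this caller created the file, .false. if it already existed
  ! (or could not be created for any other reason).
  function try_lock(lock_name) result(ok)
    character(len=*), intent(in) :: lock_name
    logical :: ok
    integer :: unit, ios

    ! status='new' makes the open fail if the file is already there.
    open(newunit=unit, file=lock_name, status='new', action='write', iostat=ios)
    ok = (ios == 0)
    if (ok) close(unit)   ! keep the file; its existence is the lock
  end function try_lock

  ! Release the lock by deleting the lock file, if it exists.
  subroutine unlock(lock_name)
    character(len=*), intent(in) :: lock_name
    integer :: unit, ios

    open(newunit=unit, file=lock_name, status='old', iostat=ios)
    if (ios == 0) close(unit, status='delete')
  end subroutine unlock

end module lockfile_mod
```

A writer would loop on try_lock until it returns .true., do its open, write, and close, and then call unlock. If a process dies while holding the lock, the stale lock file has to be removed by hand, which is one reason separate files per process are simpler.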
 
  • #10
I am pretty sure the calling program takes care of all of that for me. I am running it and I don't seem to be running into any problems except for generating too much data. I need to make some revisions.

Thanks for the help,
Timothy Van Rhein
 

Related to [FORTRAN] Writing to same file from multiple processors?

1. Can multiple processors write to the same file simultaneously in FORTRAN?

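Yes, but not safely with plain FORTRAN I/O statements alone: uncoordinated writes from several processors can interleave or overwrite one another. Coordinated writing to a shared file is known as parallel I/O; it is usually provided by a library such as MPI-IO and is a common technique for improving the performance of scientific code.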

2. How do I ensure data integrity when writing to the same file from multiple processors?

To ensure data integrity, it is important to use synchronization techniques such as locks or barriers to coordinate access to the file. These techniques ensure that only one processor is writing to the file at a time, preventing data corruption.

3. What is the best way to handle errors when writing to the same file from multiple processors?

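The best way to handle errors is to use the error-reporting features of the FORTRAN I/O statements themselves, such as the iostat= specifier (and iomsg= in Fortran 2003 and later) on open, write, and close. Checking these after each operation lets the code detect a failed write and respond to it instead of stopping or silently losing data.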

4. Are there any performance considerations when writing to the same file from multiple processors?

Yes, there are performance considerations when writing to the same file from multiple processors. These include the type of parallel I/O method used, the bandwidth of the file system, and the size and frequency of the data being written. It is important to choose the most efficient parallel I/O method and optimize your code for performance.

5. Can I use the same file for both reading and writing from multiple processors in FORTRAN?

Yes, it is possible to use the same file for both reading and writing from multiple processors in FORTRAN. However, it is important to carefully coordinate the access to the file to avoid conflicts and ensure data integrity.
