Friday, October 13, 2023

How to get all file names and folders in a directory and store them in a spreadsheet: the solution is here

Hi,

I ran many experiments to get the list of files and folders from a directory.

I needed the list because I want to study the material and complete it.

Collecting the file names from each directory and entering them manually in a spreadsheet was a huge task, so I asked Google AI and Bing AI to write the code in Python.

After some struggles and mistakes, the first version worked only for small folders with a very limited number of files.

So I took help from Stack Overflow, and the problem got solved.

Below I have given the full code and a sample.

Python code:

import os
import openpyxl

def list_files(directory):
  """Lists all files in the given directory and subdirectories.
  Args:
    directory: The directory to list.
  Returns:
    A list of all files in the given directory and subdirectories.
  """
  allfiles = []
  for root, _, files in os.walk(directory):
    for file in files:
      allfiles.append(os.path.join(root, file))
  return allfiles

def save_to_spreadsheet(files, spreadsheet_path):
  """Saves the given list of files to the given spreadsheet.
  Args:
    files: A list of files.
    spreadsheet_path: The path to the spreadsheet.
  """
  wb = openpyxl.Workbook()
  ws = wb.active

  # Write the header row (row 1 of the new, empty sheet).
  ws.append(["Index", "Folder Name", "Directory Path", "File Path", "File Name"])

  # Write one row per file, numbering the files from 1.
  for i, file in enumerate(files, start=1):
    directory_path = os.path.dirname(file)
    ws.append([
      i,
      os.path.basename(directory_path),  # name of the folder that holds the file
      directory_path,
      file,
      os.path.basename(file),
    ])

  # Save the spreadsheet.
  wb.save(spreadsheet_path)

if __name__ == "__main__":
  # Directory to list. The raw string (r"...") keeps every backslash in the
  # Windows path literal, so "\e" is not treated as an escape sequence.
  directory = r"D:\torrents\education\100 Days of Code The Complete Python Pro Bootcamp for 2023"

  # List the files in the directory and subdirectories.
  files = list_files(directory)

  # Save the file names to a spreadsheet.
  # Note: the spreadsheet is saved inside the listed directory, so a second
  # run will also list spreadsheet.xlsx itself.
  spreadsheet_path = os.path.join(directory, "spreadsheet.xlsx")
  save_to_spreadsheet(files, spreadsheet_path)
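
The code above records files only, although the title also mentions folders. Here is a minimal sketch of one way to collect both, assuming you want one entry per folder as well as per file. The function name list_entries and the "Folder" / "File" labels are my own illustrative choices and are not part of the original code.

```python
import os

def list_entries(directory):
    """Return (kind, parent_path, name) tuples for every folder and file under directory."""
    entries = []
    for root, dirs, files in os.walk(directory):
        # Sorting dirs in place makes the output stable and also fixes
        # the order in which os.walk visits the subfolders.
        dirs.sort()
        for name in dirs:
            entries.append(("Folder", root, name))
        for name in sorted(files):
            entries.append(("File", root, name))
    return entries
```

Each tuple can then be written as one spreadsheet row, for example by passing it to ws.append() in a function like save_to_spreadsheet with a matching header row.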
 

 


 

