FTF-Reformat-Dysfunctional-Report-Using-Power-Query-Modulo-Column

Fast Tip Friday – Reformat Dysfunctional Report Using Power Query Modulo Column

ByAmy Bowser-Rollins 02/17/201708/07/2021

This fast tip demonstrates how to use the Excel Power Query Modulo feature to clean up a report and make it more user friendly.

Download Sample Files

Fast Tip Friday

Fast Tip Friday – Convert Doc Number List to Searchable Bates Numbers

ByAmy Bowser-Rollins 09/24/201508/06/2018

This is an adaptation of an article posted on the Excel Esquire site. This scenario happens often in the middle of a litigation matter. Attorneys will make a list of relevant document numbers and only include the part of the number that is different. But in order for a paralegal or litigation support professional to…

Fast Tip Friday | PDF Files

Fast Tip Friday – Creating a PDF Portfolio

ByAmy Bowser-Rollins 01/07/201608/02/2021

This fast tip demonstrates the basics for creating a PDF Portfolio.

Fast Tip Friday | MS Excel

Fast Tip Friday – Using Excel VLOOKUP to Compare Lists of Bates Numbers

ByAmy Bowser-Rollins 07/28/201608/07/2021

This fast tip demonstrates how to use the VLOOKUP function in Excel to compare two lists of bates numbers and then a couple of ways to visually present the results. In a previous fast tip, I demonstrated an alternative way to use Excel to compare two lists. Download Sample Files

Fast Tip Friday | Screenshots

Fast Tip Friday – Using the Windows Snipping Tool

ByAmy Bowser-Rollins 12/01/201608/17/2021

This fast tip demonstrates how to create screenshots using the built-in Snipping Tool in Windows. In a previous Fast Tip Friday tutorial, I demonstrated how to use the art of rubber banding.

Fast Tip Friday | PDF Files

Fast Tip Friday – Delete First Page of Every PDF File in a Folder

ByAmy Bowser-Rollins 02/05/201608/08/2021

In litigation matters, we deal with a lot of scenarios related to preparing documents for different stages of the discovery process. Many of these documents are in PDF format. There are scenarios where the lead attorney has a justified reason and may instruct a paralegal or litigation support to remove one or more pages from…

Fast Tip Friday | PDF Files

Fast Tip Friday – Create Full-Text Index Across Multiple PDFs

ByAmy Bowser-Rollins 06/19/201508/02/2021

This fast tip will demonstrate how to create a full-text index across folders and subfolders of PDF files. This enables the user to run searches across all of the PDF files at once, including bookmarks and comments if desired. In last week’s Fast Tip Friday, I demonstrated how to use an index to increase search…

4 Comments

mgolab says:

02/17/2017 at 3:07 pm

Wow, that is amazing functionality. I’d go so far as to say this is massive. Although apparently Gen Y would call this hectic.

What is really interesting is that the query editor appears to be very similar to how you get data into and massage in PowerBI.

Another use would be say to do with metadata – for example generating metadata from PDFs (via the commandline supertooltik – XPDF and pdfinfo), this will typically generate 5-10 pieces of metadata for each PDF, and you then have to massage it to get it into a workable format. I could see that using Power Query could streamline this.

Massive / Hectic

Reply
1. Betsy Horn says:
  
  02/17/2017 at 5:06 pm
  
  could you help me with installing pdfinfo on a windows machine? I know it’s off-topic from the Excel Power Query, but we recently were discussing how to extract metadata from pdfs. I’m baffled with the installation process.
  
  Reply
  1. mgolab says:
    
    02/17/2017 at 6:28 pm
    
    Ok sure:
    
    My view is that there are two awesome and open source PDF tools for this kind of task being:
    PDFTK – https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/ – looking at this now, it appears that its a GUI product, we have an old version that is commandline, however I won’t branch out into using PDFTK here – suffice to say that it is also a very useful and powerful tool.
    
    XPDF – http://www.foolabs.com/xpdf/
    Download the relevant (Windows) zip files from: http://www.foolabs.com/xpdf/download.html, and navigate to where you downloaded the zip file – on my Windows10 machine the default download path is C:usersMatthewDownloads
    
    Opening the zip file, I see that it contains a folder called xpdfbin-win-3.04.
    
    Assuming you have full write access on your computer, what I like to do is to have a folder under C:Temp called tools ie the path is C:TempTools, I would then create a new subfolder of tools called XPDF. To keep it simple, lets instead assume that you make a folder called XPDF under C: ie C:XPDF
    
    Open the zip file and extract the contents and navigate to C:XPDF. Upon completion open C:XPDF and you’ll see a subfolder called xpdfbin-win-3.04. That’s too messy for me, so go into that subfolder, select all and cut. Return back to C:XPDF and paste.
    
    Now C:XPDFcontains 10 items being:
    bin32
    bin64
    doc
    xpdfbin-win-3.04
    as well as 6 files without file extensions.
    
    the ones that are important to us are bin32 bin64 and doc. If you are running 32bit Windows then you want bin32, otherwise bin64. Although my machine is 64bit Windows, I’m going to use bin32 (it should work fine).
    
    Go into bin32 folder and you’ll see 9 files that start with pdf and have file extensions of .exe. Each of these files can do things [Pro tip – pdftotext is excellent, as is pdfinfo, we don’t give the others as much love and attention as we probably should].
    
    These are all commandline tools and so are inert in Windows GUI world. That’s ok.
    
    Go back up one directory and then go into doc (ie C:XPDFdoc) and you’ll see that there are 12 files, of which 11 are text files. These are the help / usage / documentation files.
    
    Go ahead and open up pdfinfo.txt. Amongst lots and lots of [at first glance probably cryptic] instructions, what it is simply saying is that the usage of this is:
    pdfinfo [options] [PDF-file]
    
    To keep it simple what it is saying is you need to call upon pdfinfo from the commandline, then you tell it a PDF filename. Easy peasey.
    
    Ok, go back to Windows Explorer and C:XPDF, and create a sub-directory called A. ie C:XPDFA. Go and grab a simple PDF that isn’t encyrpted. Lets pretend that your PDF file is called B.pdf, and copy and paste B.pdf into C:XPDFA ie the A folder contains a single file B.pdf.
    
    Go up one directory to C:XPDF and click in the address bar and type CMD.exe.
    you’ll see that the command prompt launches and is patiently sitting at the location of C:XPDF.
    
    Now its going to get a little bit complex. What we want to do is to call up pdfinfo.exe and get it to look at the file B.pdf. The conundrum is that pdfinfo.exe is in a directory bin32pdfinfo.exe and B.pdf is in a directory AB.pdf.
    
    So, our usage is:
    bin32pdfinfo.exe AB.pdf
    
    I then get returned the metadata on-screen from B.pdf. This is nice, however I need it in a text file. So change the syntax above to:
    bin32pdfinfo.exe AB.pdf >B-metadata.txt
    
    Run that and if you have a look at C:XPDF you’ll see a new file called B-metadata.txt. Open it up and you can see that it has the metadata.
    
    This is all well and good, however its quite tricky to do this for large volumes of files. Not impossible just tricky.
    
    Note – if you weren’t able to launch cmd.exe from Windows explorer, launch it from the start menu – when I do this it goes to C:usersmatthew, what you then want is to type CD….XPDF which is saying change directory, go up one level, go up another level then go into XPDF.
    
    To take it to the next level, you can write a script to do all this so that you don’t need to do it manually. Roughly the steps would be:
    create a directory listing of the PDF files, ideally they should all be in the same directory. If the file names or directories have spaces in their names ie docs for affidavitabc 001.pdf and def 002.pdf, then you need to wrap the directory and file names in double quotes.
    
    Hope this helps.
    Phew – that was hectic.
    
    Reply
    1. Betsy Horn says:
      
      02/22/2017 at 9:26 am
      
      Thank you so much for your help. This is incredibly nice of you!

Similar Posts

4 Comments

Leave a Reply Cancel reply