FAFFind returns duplicated Directories on Windows 7?

Started by IJRobson, June 21, 2013, 02:46:04 AM

IJRobson

I have a WinBatch tool that uses FAFOpen() to scan the HDD and return File information.  When run on Windows XP this works fine, but on Windows 7 it returns the same Folders over and over again.

The issue appears to relate to the "Application Data" directories for each User on the Computer.  FAFOpen reports that each user's Application Data folders contain the same files, yet Windows Explorer shows those directories as empty, as 'Access Denied', or as not existing at all.

For Example:
C:\Documents and Settings\FRED\AppData\Local\Application Data\Application Data\Microsoft\Windows\Temporary Internet Files
C:\Documents and Settings\FRED\AppData\Local\Application Data\Application Data\Application Data\Microsoft\Windows\Temporary Internet Files
C:\ProgramData\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Application Data\Documents\My Music\Sample Music\

These directories get longer and longer, gaining another level of \Application Data\ each time, until Windows reports that the WinBatch Application has stopped responding and kills the application.

Thanks

kdmoyers

Fascinating.  Could you post a really short program that demonstrates the problem? It may seem like a bother, but it helps SO MUCH to figure out what's going on.
The mind is everything; What you think, you become.

IJRobson

Below is the part of the Scan Tool that performs the File Scan using FAFOpen.  As this is only part of the Source Code, it may reference variables etc. defined outside the code posted:


Code (winbatch):
; Initialize flags for readability
fsHidden  = 1    ; Include hidden files
fsSystem  = 2    ; Include system files
fsRecurse = 16   ; Look in sub directories
File = FileOpen(DatFile, "WRITE")   ; DatFile is defined elsewhere in the tool
Drives = DiskScan(2)                ; Tab-delimited list of local fixed drives

For A = 1 to ItemCount(Drives, @Tab)
   DriveID = ItemExtract(A, Drives, @Tab)

   ; Open a search handle
   fsHandle = fafOpen("%DriveID%\", "*.*", fsHidden|fsSystem|fsRecurse)
   LastFilePath = ""

   ; Perform the search
   While @TRUE
      sFound = fafFind(fsHandle)
      If sFound == "" Then Break
      FPath = FilePath(sFound)

      ; Start a new [section] in the output whenever the directory changes
      If LastFilePath <> FPath Then
         FileWrite(File, "")
         FileWrite(File, "[%FPath%]")
         LastFilePath = FPath
      EndIf

      If FileExist(sFound) Then
         FName = FileRoot(sFound)
         Extension = StrUpper(FileExtension(sFound))
         FileName = StrCat(FName, ".", Extension)

         ; Only executables and libraries carry version information
         If Extension == "EXE" || Extension == "COM" || Extension == "DLL" Then
            FVersion = FileVerInfo(sFound, "", "FileVersion")
         Else
            FVersion = ""
         EndIf

         Details = StrCat(FileName, "=F|", FileSizeEx(sFound, 1), "|", FileYMDHMS(sFound), "|", FVersion)
         ; Write inside the If so a stale Details line is never repeated
         FileWrite(File, Details)
      EndIf
   EndWhile
   fafClose(fsHandle)
Next A
FileClose(File)

ChuckC

It looks like you are traversing junctions or directory symbolic links that have been created in the file system on a Win7 computer.  If one of these types of reparse points exists such that the target location that it redirects to is a directory somewhere above it, then this is exactly the type of behavior that you would expect to encounter.  The Windows I/O Manager will process up to 32 reparse points during the process of resolving a path and attempting to open a directory or file, and once it traverses 32 reparse points, it will return an error.
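
To illustrate the behavior described above, here is a minimal sketch in Python (not WinBatch, and not how the extender is implemented). It builds a throwaway directory containing a link named "Application Data" that points back at its own parent, then walks it with a scanner that follows links. The names `naive_walk` and `build_loop` are hypothetical, a symbolic link stands in for a junction, and creating one on Windows may require elevated rights or Developer Mode.

```python
import os
import tempfile


def naive_walk(root, max_depth):
    """Follow every directory, links included, down to max_depth levels."""
    found = []

    def visit(path, depth):
        if depth > max_depth:
            return
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            if os.path.isdir(full):      # isdir() follows links, like a naive scanner
                found.append(full)
                visit(full, depth + 1)

    visit(root, 1)
    return found


def build_loop(base):
    """Create base/profile with a link 'Application Data' pointing back at it."""
    profile = os.path.join(base, "profile")
    os.mkdir(profile)
    os.symlink(profile, os.path.join(profile, "Application Data"),
               target_is_directory=True)
    return profile


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        for path in naive_walk(build_loop(base), max_depth=3):
            print(os.path.relpath(path, base))
```

Each level of recursion adds another "Application Data" component to the path, which matches the pattern in the listings earlier in the thread. The explicit `max_depth` plays the role of the system's reparse limit.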


IJRobson

OK, that sounds like a theory.

The problem is this is happening when FAFFind() is called normally which implies a problem with the Extender?

I have changed my code to look for a recursive path '\application data\application data\' and to ignore the returned files. This cuts down the amount of incorrect information being returned, but it is not foolproof and only masks the problem rather than solving it.

Thanks
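
The path check described in this post could look something like the following. This is a minimal sketch in Python rather than WinBatch, and `has_repeated_segment` is a hypothetical name, not part of the poster's tool. It compares whole path components case-insensitively, so a folder such as "Application Database" is not mistaken for a repeat.

```python
from pathlib import PureWindowsPath


def has_repeated_segment(path, segment="Application Data"):
    """Return True if `segment` appears twice in a row in `path`.

    Hypothetical helper: components are compared case-insensitively,
    the way Windows itself treats file names.
    """
    parts = [p.lower() for p in PureWindowsPath(path).parts]
    target = segment.lower()
    return any(a == target and b == target for a, b in zip(parts, parts[1:]))


if __name__ == "__main__":
    looped = r"C:\ProgramData\Application Data\Application Data\Documents"
    normal = r"C:\Users\FRED\AppData\Local\Microsoft\Windows"
    print(has_repeated_segment(looped))
    print(has_repeated_segment(normal))
```

As the post says, this only hides the symptom: it catches immediate repeats of one known folder name and still lets the first pass through each junction into the output.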

Deana

Can you confirm whether or not any reparse points exist on this system? If so, please provide details. I would like to try to reproduce the problem.
Deana F.
Technical Support
Wilson WindowWare Inc.

td

Quote from: IJRobson on June 21, 2013, 05:45:58 AM
OK, that sounds like a theory.

Better than a theory.  Directories like 'Application Data' are a form of reparse point called a directory junction. Normally, junctions like Application Data under users\whatever have a 'deny' Everyone ACE for file read access. This means that even an administrator should not be able to access the junction's target. The extender has no more privileges than the account that is running it, so it shouldn't be able to follow the junction either.  Apparently, the permissions on the junction have been fiddled with and/or you are not running the script with the same privileges you are using to view the folders in the shell.  It is also likely that someone or something has set up circular reparse points.  In other words, a junction's target contains a reparse point whose own target leads back to that junction.

Quote
The problem is this is happening when FAFFind() is called normally which implies a problem with the Extender?

The extender has been tested with all the flavors of reparse points and there are no known issues with reparse points.  Directory  junctions are presented to the extender by the system as normal directory paths.  It is quite capable of following the path to the depth Chuck indicated as the maximum depth returned by the file system.  When that limit is reached the extender simply moves on.

I set up a test case by removing the 'deny' ACE on 'Application Data' and creating a circular junction.  Naturally, the extender took a while to perform the search, but it did not produce any errors.

How do you know that your script is failing during a call to FafFind?   

Quote
I have changed my code to look for a recursive path '\application data\application data\' and to ignore the returned files. This cuts down the amount of incorrect information being returned, but it is not foolproof and only masks the problem rather than solving it.

The information is not incorrect from the system perspective.  It is an accurate representation of directory structure as presented by the system to user processes calling win32 API functions.

Since the hanging process error cannot be reproduced we would need more information to be able to track down the cause of that error.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

IJRobson

Deana,

I have not changed any Reparse Points; Windows 7 is a standard installed version of Windows as shipped with the Computer.  I have just installed the System that I have been using for months on Windows XP Machines.

TD,

I have run this system on four different Windows 7 Computers and all report multiple 'Application Data' folders during the scans.  All are standard Windows 7 Installs.  Only one of these Computers crashes during the process; the rest work, but all report these Reparse Points.

The System is running as a Windows Service, so I am guessing the 'System' Access Rights override the ACE Deny settings?

I can't be sure that the FafFind is the command that finally kills the System because Windows 7 just reports that the WinBatch Application is not responding and closes the Application.

The System creates a File listing the Folders and Files found during the Scan.  On a normal Computer this File is around 3 to 4MB in size.  On Windows 7 this jumps to 10 to 15 MB.  On the failing Computer it stops after about 9 Hours of running, at 234MB, and always lists the last Directory as an 'Application Data' Directory.  I have looked at why the size differs, and it is due to a Windows Temp Folder that contains 17,384 files, which FafFind then reports multiple times.

Thanks


ChuckC

Since your script is running as a native NT service, you would have to identify what user account the service is configured to use before you can determine what access rights it is going to have to the file system.  If it turns out that the service is running as local system, then it is going to have a level of access that is greater than what members of the Administrators group have, and the restrictive deny permissions that are present on some of the junctions under user profiles will not prevent the script from traversing those junctions.

With that said, it is almost a certainty that there is, in fact, a recursive junction or directory symbolic link present in the path on the affected computer.  If you want to investigate this further, use the "psexec" utility in the SysInternals Suite to run an instance of "cmd.exe" as local system, then use the CD and DIR commands to interrogate the file system contents under the user profile in question.


td

Note that you would need to use "dir /a" to see hidden directory junctions.
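
The same inspection can be scripted. The following is a rough Python sketch (again not WinBatch) that lists the link-like entries in one directory together with their targets, which makes a junction that points back up the tree easy to spot. `list_links` is a hypothetical name; `os.readlink` resolves directory junctions on Windows only in Python 3.8 or later, and seeing protected profile junctions still requires running with sufficient rights.

```python
import os


def list_links(directory):
    """Return sorted (name, target) pairs for entries that are links or junctions."""
    result = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                target = os.readlink(entry.path)
            except (OSError, ValueError):
                continue                 # ordinary file or directory, or unsupported tag
            result.append((entry.name, target))
    return sorted(result)


if __name__ == "__main__":
    for name, target in list_links("."):
        print(name, "->", target)
```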

IJRobson

Thanks for the information.

The NT Service is running as a "Local System" so I think that explains why it is traversing these folders.  Running as Local System does allow me access to Directories that are normally "Access Denied".

It looks like Application Data \ Temporary Folders are interlinked?  If I look at the file list returned from each of the C:\Users\{User Name}\ directories they contain the same list of files so they must all be pointing to the same location of the HDD.  I have confirmed this by deleting some files from one User Area and they disappear from the file list returned from another User's Area?

This also explains why I am getting so many files returned. 8 different User Directories each with Multiple \Application Data\ references returning 17,000+ files in each \Application Data\ directory.

How or why all these junctions point to the same area is unknown, but it appears to happen on all Windows 7 installations I have run the system on (four so far).  This could be Windows or Application related?  Tracing the 17,000+ files, most relate to Internet Explorer, so that may have modified something?

What if I removed the Hidden and System Flags from the FAFOpen is that likely to stop it traversing these normally "Access Denied" Directories?

ChuckC

The attributes mask RASH is something that exists on all objects in the file system regardless of whether they are files, directories or reparse points.  Given this, excluding hidden & system files from your search may result in not finding regular directories and files that are actually of interest.

For Windows Vista & newer, the usage of junctions and directory symbolic links within user profiles is a common occurrence.  What you are observing is completely normal and nothing is broken within the file system.  However, the software that is interacting with the file system has not been updated to be aware of reparse points and thus it is handling them incorrectly.

In my day job, I am responsible for the development & maintenance of a commercial software product that scans file systems across the entire enterprise and reports on the contents that are discovered.  In this software, which runs across Windows and Linux, we always stop recursion at reparse points and directory symbolic links in order to avoid duplicate results in the collected data and to prevent recursion problems due to circular links in the file system.

The proper thing to do would be to have FAFFind be modified to support a new flag that causes it to report the existence of reparse points but to not recurse into them.  Alternatively, you can write your own recursive file system scanning routine that replaces FAFFind and which handles reparse points correctly.
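
A "report but do not recurse" scan of the kind suggested here might be sketched as follows. This is Python rather than WinBatch, it is not a replacement for FAFFind, and `is_reparse_or_link` and `scan` are hypothetical names. It assumes that treating every reparse point as a stopping place is acceptable, which also means files that happen to be reparse points are reported as links.

```python
import os
import stat


def is_reparse_or_link(path):
    """True for symbolic links and, on Windows, any reparse point (junctions included)."""
    st = os.lstat(path)                            # lstat() does not follow links
    attrs = getattr(st, "st_file_attributes", 0)   # this field only exists on Windows
    return stat.S_ISLNK(st.st_mode) or bool(attrs & stat.FILE_ATTRIBUTE_REPARSE_POINT)


def scan(root):
    """Return (files, links): files under root, and links reported but never entered."""
    files, links = [], []
    pending = [root]
    while pending:
        current = pending.pop()
        try:
            names = sorted(os.listdir(current))
        except OSError:
            continue                               # e.g. access denied: skip and move on
        for name in names:
            full = os.path.join(current, name)
            try:
                if is_reparse_or_link(full):
                    links.append(full)             # record it, do not descend
                elif os.path.isdir(full):
                    pending.append(full)
                else:
                    files.append(full)
            except OSError:
                continue                           # entry vanished or is unreadable
    return files, links
```

Because links are recorded instead of followed, each real file is visited once and a circular junction cannot lengthen the path indefinitely.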

IJRobson

Thanks ChuckC that is useful information.

So how do you know if a Directory Entry is a reparse Point?

As the existing FAFFind command basically works, what I need is something to tell me when I have reached a reparse point, and then I can ignore any entries returned by FAFFind as it traverses the Directory Tree from that Point.

Do you know of a call either in WinBatch or Windows that can report if a Directory is a reparse point?

Thanks
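
The filtering idea in this post, keeping FAFFind as it is and discarding whatever it returns beneath a known reparse point, could be sketched like this. It is a Python illustration rather than WinBatch, and `is_under` and `drop_below_links` are hypothetical names. It assumes the list of reparse-point directories has already been collected by some other means.

```python
from pathlib import PureWindowsPath


def is_under(path, ancestor):
    """True if `path` is `ancestor` itself or lies somewhere beneath it."""
    p, a = PureWindowsPath(path), PureWindowsPath(ancestor)
    return p == a or a in p.parents       # Windows paths compare case-insensitively


def drop_below_links(found_paths, link_dirs):
    """Keep only the paths that do not sit under any known reparse-point directory."""
    return [p for p in found_paths
            if not any(is_under(p, d) for d in link_dirs)]


if __name__ == "__main__":
    links = [r"C:\Users\FRED\Application Data"]
    found = [
        r"C:\Users\FRED\Documents\a.txt",
        r"C:\Users\FRED\Application Data\Microsoft\b.dat",
    ]
    print(drop_below_links(found, links))
```

Note that this still pays the cost of the extender walking through each junction; it only keeps the duplicates out of the output.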

td

Quote from: ChuckC on June 24, 2013, 05:11:18 AM
The proper thing to do would be to have FAFFind be modified to support a new flag that causes it to report the existence of reparse points but to not recurse into them.  Alternatively, you can write your own recursive file system scanning routine that replaces FAFFind and which handles reparse points correctly.

Reparse points have been around and used on Windows for more than a decade, since Windows 2000, and the extender has been around for 5 years.  No user has ever requested we change how reparse points are handled.  This suggests that circular reparse points are rare for extender users and/or extender users are not concerned about the extender following circular references.  Projecting one's preferences onto an entire user group is seldom a good software development practice.  This is particularly so when the desire is to keep things simple when possible.

There is a long-standing bullet on the extender's enhancement list to add an ignore flag for reparse points, but there has been no interest until now.  Report but don't traverse is something that has not been considered until now.

td

Quote from: IJRobson on June 24, 2013, 07:12:45 AM
So how do you know if a Directory Entry is a reparse Point?

Here is one approach.  There may be better ones.  You would need to modify it according to your needs, of course.

Code (winbatch):

; FILE_ATTRIBUTE_REPARSE_POINT is 0x400 (1024) in the Win32 headers
FILE_ATTRIBUTE_REPARSE_POINT = 1024
hKernel32 = DllLoad("kernel32")

strDir = "C:\Users\username\Application Data"
; Test the reparse-point bit in the attributes returned by GetFileAttributesW.
; Note: the call returns INVALID_FILE_ATTRIBUTES (all bits set) when the path
; cannot be queried, so a missing path would also pass this bit test.
if DllCall(hKernel32, long:"GetFileAttributesW", lpwstr:strDir) & FILE_ATTRIBUTE_REPARSE_POINT then strText = "is"
else strText = "is not"

DllFree(hKernel32)

Message(strDir, strText:" a reparse point")
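
The bit test in the snippet above depends on two details of GetFileAttributesW: the reparse flag is bit 0x400 (1024), and a failed call returns INVALID_FILE_ATTRIBUTES, which has every bit set. A small Python sketch of the same arithmetic follows; `is_reparse_point` is a hypothetical helper that takes an already-fetched attribute value rather than calling the Win32 API.

```python
FILE_ATTRIBUTE_HIDDEN = 0x02
FILE_ATTRIBUTE_SYSTEM = 0x04
FILE_ATTRIBUTE_DIRECTORY = 0x10
FILE_ATTRIBUTE_REPARSE_POINT = 0x400      # 1024, the constant used in the snippet
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF      # returned by GetFileAttributesW on failure


def is_reparse_point(attrs):
    """Interpret a GetFileAttributesW-style value; failures are never reparse points."""
    attrs &= 0xFFFFFFFF                   # treat a signed -1 the same as 0xFFFFFFFF
    if attrs == INVALID_FILE_ATTRIBUTES:
        return False
    return bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT)


if __name__ == "__main__":
    junction = (FILE_ATTRIBUTE_REPARSE_POINT | FILE_ATTRIBUTE_DIRECTORY
                | FILE_ATTRIBUTE_HIDDEN | FILE_ATTRIBUTE_SYSTEM)
    print(hex(junction), is_reparse_point(junction))
    print(is_reparse_point(FILE_ATTRIBUTE_DIRECTORY))
    print(is_reparse_point(-1))
```

A hidden, system directory junction carries a value such as 0x416, which the helper accepts, while -1 or 0xFFFFFFFF is rejected instead of being mistaken for a junction.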

IJRobson

TD,

Thanks for clarification on the Extender implementation.

There is something different between the Windows 7 setup and older Windows as I have never noticed this problem when running the same WinBatch code on older Windows.  So either the way Microsoft is using reparse points has changed or the default reparse point locations / pointers are linking to more / recursive points?

I do agree that most users would not notice duplicated files caused by the FAFFind traversing the reparse points.  I only noticed because it is so much slower to run the same code on Windows 7 and the output was reporting 20 times more files than are actually on the HDD! 

I also agree I am probably compounding the problem by running as a Local System User.  But again this works fine on Windows NT / XP only Windows 7 produces the recursive \Application Data\ directories.

Anyway, thanks for the method to identify a reparse point.  I will implement this in my code so I know which directories to ignore the file information from.

Thanks

kdmoyers

((Thanks guys for what has been the most interesting thread on the new forum to date! -Kirby))

IJRobson

I have run the new version of the code on a Windows XP Machine and it only reports four reparse points, and none of these are User related:
        C:\WINDOWS\assembly\GAC_32\System.EnterpriseServices\2.0.0.0__xxxxxxxxxxxxxxxxx\
        C:\WINDOWS\assembly\GAC_MSIL\IEExecRemote\2.0.0.0__xxxxxxxxxxxxxxxxx\
        C:\WINDOWS\Microsoft.NET\assembly\GAC_32\System.EnterpriseServices\v4.0_4.0.0.0__xxxxxxxxxxxxxxxxx\
        C:\WINDOWS\Microsoft.NET\assembly\GAC_MSIL\Microsoft.Workflow.Compiler\v4.0_4.0.0.0__xxxxxxxxxxxxxxxxx\

I am running the same code on a Windows 7 machine I will post the outcome from that scan once it has finished.

IJRobson

I also ran it on Windows 7. Firstly, this change has reduced the scan time from over 9 hours to just under an hour, and the File List reports 135 separate Reparse Points, most relating to the User Area.

So that explains why I am seeing a difference on Windows 7.

I will run it on some more Windows 7 Machines tomorrow and see if I get the same results.

td

Quote from: IJRobson on June 24, 2013, 12:07:12 PM
I also ran it on Windows 7. Firstly, this change has reduced the scan time from over 9 hours to just under an hour, and the File List reports 135 separate Reparse Points, most relating to the User Area.

If you were to run your code on my old XP system you would find many more. That is primarily because of the software I use on that system and also because it has multiple volumes mapped to various and sundry directories to make my life easier (in theory).

You would also be more likely to find reparse points on a Windows 2003 server because they are more likely to have multiple volumes, disks, etc.

As Chuck pointed out, on Win 7 (and Vista) MSFT added junctions to user directories for the old XP user file locations.  I guess the idea is to help older XP applications work better on Vista/Win 7, but I am not sure how helpful it really is given the default security settings shipped with the system.

If you are executing your script and reporting results just to convince us to add an enhancement to the FAF extender, we appreciate the effort but it is not necessary.  Vista/Windows 7 have been around for quite a while and the default locations of junctions and links are well known on those systems.  And since someone finally, in so many words, requested some kind of reparse point flag, there is a good chance that something along those lines will be added to the extender in a future release.

IJRobson

Thanks for the reply.

These last two responses have just been observations to round off this topic.  I now have a working solution to my problem and I have also learned something about Reparse Points  which is something I had never come across before.

Thanks to everyone for their help and information, it has been useful.