This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Repost, different list...File::Spec, Cygwin, Syntactic vs. Semantic path analysis


This was originally sent to the cygwin and module-authors lists, but since File::Spec
is part of core perl, it was suggested I move it to the perl5-porters list,
though it's not really 'just' a porting issue, since it also involves the
issue of how File::Spec should be _defined_ to behave (syntactic analysis
in the absence of checking with a local fs, since there is no guarantee that a
given path was constructed for the current system; the exceptions are routines
like 'pwd').

Anyway...here it is (again) -- sorry for the duplication to the cygwin
list, but cygwin folks are just special, ya know? :-)
-l
----

A bit late to the party, I know, but wanted to chime in on the Cygwin File::Spec discussion.  I'm 'cc'ing the cygwin list as a "heads up" for any interested parties.

A more satisfactory mapping is to base "Cygwin" on Win32, not Unix.

Cygwin, as an "OS interface" _partially_ supports posix mapping -- it supports posix naming to the same extent the underlying Win32 OS supports it -- not to the level Unix supports it.

For example, 
1) "\:" are illegal in Cygwin pathnames as they are under Win32.  Posix doesn't have this difference.

2) By default, case doesn't matter under Cygwin in the cases where it doesn't matter under Win32.  It's not _ignored_ -- case is preserved on filename creation but not lookup.  Ex:

law/proj/fspec> touch aBcDe
law/proj/fspec/tmp> ls
aBcDe							(expected)
law/proj/fspec/tmp> mv ABCDE AbCdE		(this works!)
law/proj/fspec/tmp> ls
AbCdE							(case is changed)
law/proj/fspec/tmp> touch aBcDe		(try to recreate original filename)
law/proj/fspec/tmp> ls
AbCdE							(win32 behavior, not posix behavior)

3) Using the unix and win32 syntaxes to parse the valid cygwin filenames "i:/fee/fie/foe/fum" and "i:fee/fie/foe/fum" yields (under 5.8):

unix:v,d,f=, i:/fee/fie/foe/, fum
unix: 'i:', 'fee', 'fie', 'foe', ''
Win32:v,d,f=i:, /fee/fie/foe/, fum
Win32:'', 'fee', 'fie', 'foe', ''
    and
unix:v,d,f=, i:fee/fie/foe/, fum
unix: 'i:fee', 'fie', 'foe', ''
Win32:v,d,f=i:, fee/fie/foe/, fum
Win32:'fee', 'fie', 'foe', ''

	Under cygwin, you cannot create a filename "i:fee" -- it creates a filename 'fee' (somewhere**) on the "i:" filesystem (volume). 
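
The difference between the two rule sets above is purely a matter of string handling, so it can be reproduced without touching a filesystem. As a rough illustration (in Python rather than Perl, with its `posixpath` and `ntpath` modules standing in for the Unix and Win32 rule sets; this is not File::Spec, and `split_path` is a hypothetical helper), the same two names split like this:

```python
import ntpath
import posixpath

def split_path(rules, path):
    """Split path into (volume, directories, file) using string rules only."""
    volume, rest = rules.splitdrive(path)      # never consults the filesystem
    directories, filename = rules.split(rest)
    return volume, directories, filename

for name in ("i:/fee/fie/foe/fum", "i:fee/fie/foe/fum"):
    print("unix: ", split_path(posixpath, name))
    print("win32:", split_path(ntpath, name))
```

Under the Unix rules "i:" stays glued to the first directory component; under the Win32 rules it is peeled off as the volume. Unlike the File::Spec output above, these Python modules drop the trailing separator from the directory part.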

====
=>	For some reason that eludes me, the current File::Spec implementation
=> returns a 'null' directory as the last directory component while no such
=> component existed in the original pathname.  I would tend to think this is a
=> bug. Elucidating comments?  Agreement?  Disagreement?
====
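
One possible reading of where the empty component comes from, offered as an assumption and not as a description of File::Spec internals: the directory string handed to the splitter still ends in a separator, and a split that preserves empty fields reports that as a final empty name. A small Python sketch of the effect:

```python
directories = "/fee/fie/foe/"            # directory part, trailing separator kept

parts = directories.split("/")           # plain split keeps empty fields
print(parts)

trimmed = directories.rstrip("/").split("/")
print(trimmed)                           # trailing separator removed first
```

The leading empty string marks the root; the trailing one only records that the input ended in "/".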

	Cygwin, and possibly the Win32 module, are inconsistent in handling the differences between i:/foobar/ and i:.  On one hand i: is considered a 'volume', but on the other hand i:/ seems to evaluate to the same, incorrect, value. In "Win32", for each 'fs' of the form "<x>:", x of class <[:alpha:]>, there is a process-specific "current directory".  This can be seen by:

From cmd.exe; the cygwin utils "ls", "pwd", "printenv" and "grep" are in my path.

1) Fresh cmd shell, what's on filesystem "Z"?

C:\Documents and Settings\law>dir /b /a z:	## /a=show hidden, /b=skip header
desktop.ini
Content.IE5						## expected output, /a=show hidden files
							
C:\Documents and Settings\law>ls z:
Content.IE5  desktop.ini			## again, expected output...
							
2) Where are we?

C:\Documents and Settings\law>cd
C:\Documents and Settings\law			## expected -- shows same as the
							## default prompt:
C:\Documents and Settings\law>echo %prompt%
$P$G							## $p=cur drive and path, $g=">"
C:\Documents and Settings\law>pwd		## on cygwin?
/cygdrive/c/Documents and Settings/law	## hmmm...cygwin translated c: to
							## /cygdrive/c;  seems like under
							## cygwin, /cygdrive/c is a win32 volume.
3. What does File::Spec say about that?
ishtar:law/proj/fspec> fspec "c:Documents and Settings/law/"
Win32:v,d,f=c:, Documents and Settings/law/, 
Win32:'Documents and Settings', 'law', ''	## expected, C: is a fs (volume)
cygwin: v,d,f=, c:Documents and Settings/law/, 
cygwin: 'c:Documents and Settings', 'law', ''
							## oops, seems File::Spec::Cygwin
							## doesn't act the same way the Cygwin
							## layer does...>>not good<<, bug?

4. How about "/cygwin/c/Documents and Settings/law" ?
cygwin: v,d,f=, /cygwin/c/Documents and Settings/law/, 
cygwin: '', 'cygwin', 'c', 'Documents and Settings', 'law', ''
unix:v,d,f=, /cygwin/c/Documents and Settings/law/, 
unix: '', 'cygwin', 'c', 'Documents and Settings', 'law', ''
Win32:v,d,f=, /cygwin/c/Documents and Settings/law/, 
Win32:'', 'cygwin', 'c', 'Documents and Settings', 'law', ''

							## All the same...good, or is it?...um
5.
							## under cygwin, /cygdrive/x/ can't be used
							## as a filename:
law/proj/fspec> mkdir -p /cygdrive/e/foo
mkdir: cannot create directory `/cygdrive/e': No such file or directory
							## looks like /cygdrive/<x> is just
							## as reserved as '<x>:' -- it's
							## a <fs> spec

oops...under cygwin, /cygdrive/x/
							## always specifies a volume yet
							## File::Spec doesn't seem to know
							## about this distinction -- oh well,
							## we can use rule if win32 elements
							## present, 
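
To make the "/cygdrive/<x> is a volume" reading concrete, here is a minimal Python sketch of a purely syntactic rule that peels such a prefix off as the volume. Everything in it is hypothetical: `split_cygdrive` is not part of File::Spec, and it assumes the default "/cygdrive" prefix, which a Cygwin installation can change.

```python
import re

# Hypothetical rule: a leading /cygdrive/<letter> names a volume, like '<letter>:'.
CYGDRIVE = re.compile(r"^/cygdrive/([A-Za-z])(?=/|$)")

def split_cygdrive(path):
    """Return (volume, rest); volume is '' when no /cygdrive/<x> prefix is found."""
    match = CYGDRIVE.match(path)
    if match is None:
        return "", path
    # A bare '/cygdrive/e' refers to the root directory of that volume.
    return match.group(0), path[match.end():] or "/"

print(split_cygdrive("/cygdrive/e/foo"))
print(split_cygdrive("/cygdrive/etc/foo"))   # 'etc' is not a drive letter
```

The lookahead keeps names such as "/cygdrive/etc" from being mistaken for drive "e".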
6. How about remote file systems?
ishtar:law/proj/fspec> fspec '\\fee\fie\foe\fum'
argv#=0
cygwin: v,d,f=, , \\fee\fie\foe\fum
cygwin: 
unix:v,d,f=, , \\fee\fie\foe\fum
unix: 
Win32:v,d,f=\\fee\fie, \foe\, fum
Win32:'', 'foe', ''
							## hmmm. cygwin thinks it is somehow
							## different than the Win32 parsing.
							## But it is not.  As stated above,
							## "\" isn't a valid filename character
							## under cygwin.  The correct parsing
							## is the Win32 version -- since "\\"
							## always starts a hostname under
							## cygwin and win32.  The "component"
							## is the remote public share name.
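
The same contrast shows up in any pair of Unix-style and Win32-style string rules. As an illustration only (Python's `ntpath` and `posixpath`, not File::Spec), the Win32 rules treat the host and share together as the volume, while the Unix rules see no separator at all:

```python
import ntpath
import posixpath

unc = r"\\fee\fie\foe\fum"

print(ntpath.splitdrive(unc))    # host and share together form the volume
print(posixpath.split(unc))      # no '/' present, so the whole string is one name
```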

7. What about 'other drives'?
C:\Documents and Settings\law>z:
Z:\							## let's go to z...
Z:\>pwd						## prompt indicates Win32 curdir
/cygdrive/z						## cygwin...consistent

8. Environmental factors...(back to 'c')
C:\Documents and Settings\law>printenv |grep Z		## nothing in env for Z
C:\Documents and Settings\law>cd z:Content.IE5
							## sets curdir in env:
C:\Documents and Settings\law>printenv |grep Z
!Z:=Z:\Content.IE5				## it's there now...

9. Current dir check:
C:\Documents and Settings\law>pwd		## win32 prompt
/cygdrive/c/Documents and Settings/law	## cygwin -- expected

10: What's on Z again ...
C:\Documents and Settings\law>dir /b /a z:
index.dat
desktop.ini
5OG8E8Y3
RRJT4N16
30F1BLRO
UYYGJ85V						## Ahh...the Win32 separate curdir
							## concept.

C:\Documents and Settings\law>ls z:
Content.IE5  desktop.ini			## uhoh, cygwin's ignoring the underlying
							## OS concepts, this doesn't bode well
11. Let's switch to our 'Z' drive again
C:\Documents and Settings\law>z:
Z:\Content.IE5>					## shows up win32's per-drive curdir
							## concept
Z:\Content.IE5>pwd
/cygdrive/z/Content.IE5				## pwd provides consistent feedback,

12. Commands: mkdir "x:c", "<ls/dir> x:", cd "x:c", "<ls/dir> x:",
	assuming x is a drive for the win32/cygwin examples:
	On Unix - 2 errors, ending up in dir "x:c".
	On Win - no errors and the same output from both listings; no change in the
	current directory on the current drive.
	On cygwin - the output of the two listings is different and the current directory is changed.

	Cygwin isn't really behaving like Unix or Win32.
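
The Win32 behaviour seen in steps 7 through 11 can be captured by a small table of per-drive current directories. The following Python toy model is a sketch under that assumption; `DriveTable` and its methods are hypothetical names, and it ignores ".." handling and UNC paths.

```python
# Toy model of Win32 per-drive current directories (all names hypothetical).
class DriveTable:
    def __init__(self, current_drive):
        self.current_drive = current_drive.lower()
        self.cwd = {}                      # drive letter -> current dir on that drive

    def resolve(self, path):
        """Turn 'z:name', 'z:/name' or 'name' into an absolute 'z:/...' path."""
        drive, rest = self.current_drive, path
        if len(path) >= 2 and path[1] == ":" and path[0].isalpha():
            drive, rest = path[0].lower(), path[2:]
        rest = rest.replace("\\", "/")
        if not rest.startswith("/"):       # drive-relative: prepend that drive's cwd
            base = self.cwd.get(drive, "/")
            rest = base.rstrip("/") + "/" + rest
        return drive + ":" + (rest.rstrip("/") or "/")

    def chdir(self, path):
        """Record a new current directory for the drive named in path."""
        absolute = self.resolve(path)
        self.cwd[absolute[0]] = absolute[2:]

table = DriveTable("c")
table.chdir("c:/Documents and Settings/law")
before = table.resolve("z:")           # nothing recorded for z: yet
table.chdir("z:Content.IE5")           # like 'cd z:Content.IE5' from C:
after = table.resolve("z:")            # now follows z:'s own current dir
print(before, after, table.resolve("z:/"))
```

In this model "z:" and "z:/" are different names once a current directory has been set on z:, which is the distinction the listings above show Win32 making.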

	So I see a couple of problems: 
1) File::Spec::Cygwin should mainly be based on Win32, if not exactly the same.
2) Cygwin should pay attention to the Win32 concept of per-drive pathnames
	when win32 drive letters are used (though /cygdrive/c should still refer to the root dir of the 'c' drive).
3) File::Spec needs cleanup.  Is it supposed to be parsing "Syntactically" or "Semantically"?  They are different.
4) File::Spec::Win32 would give incorrect results on one of my old Samba exported fs's: server exported people's home directories as "/home/<user>" to prevent Win98 from automatically using \home\<user> as a location for a traveling profile.

Syntactic parsing would yield a volume/fs of //server/home and
file=<user>, but the exported fs name was really "/home/user", and it should have been parsed as one volume name: //server/home/user, or in DOS-syntax, "\\server\home/user".

Syntactically, this can't be anticipated or interpreted, so a simple, documented limitation would be used: the assumption that \ and / are not intermixed as pathname component separators in the same pathname.  So the first "/" sets the dir sep to "/", and a later "\" could signal a warning that the syntax is unclear.  But a pathname with "\" as the first dir sep would throw an illegal-filename exception if "/" was encountered, because in places where '\' is a
dirsep, '/' is a switch character.
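
A minimal Python sketch of that rule, as one reading of the paragraph above. `split_components` and `IllegalFilename` are hypothetical names, and the warning text is made up for illustration:

```python
import warnings

class IllegalFilename(ValueError):
    """Raised when '/' appears in a path whose separator is already a backslash."""

def split_components(path):
    """Split on whichever of '/' or backslash appears first in path."""
    positions = [i for i in (path.find("/"), path.find("\\")) if i != -1]
    if not positions:
        return [path]                      # no separator at all
    sep = path[min(positions)]             # the first separator seen wins
    other = "\\" if sep == "/" else "/"
    if other in path:
        if sep == "\\":
            # Where backslash separates directories, '/' introduces a switch.
            raise IllegalFilename("'/' is a switch character in %r" % path)
        warnings.warn("unclear syntax: backslash inside %r" % path)
    return path.split(sep)

print(split_components("//server/home/user"))
```

Under this rule "\\server\home/user" is rejected outright, which is the documented limitation: the Samba share name described above cannot be written in DOS syntax.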

Referring to problem 3 above: semantically, under Unix as in other OS's,
separate filesystems should be parsed out as separate volumes -- since the concept of a 'volume' in OS terminology is most often used to describe a filesystem.  Older usage, I believe, used the term interchangeably with 'disk', but in modern usage (influenced most commonly by Win32), it is a file system.

The Win32 module, semantically, also has to recognize the same problem in later NT-based OS's for consistency, since 'volumes' can be mounted on arbitrary mount points in the fs-hierarchy.  This underscores the concept of a 'volume' as a mountable fs -- a definition that, semantically, would apply to Unix as well.  But it appears, if only vaguely, that the intent of File::Spec was to provide OS-blind ways of manipulating arbitrary filenames (not necessarily limited to valid filenames on the current system).  As such, default manipulation routines should consistently be altered to be 'syntax-only' (except where absolutely
necessary, e.g. the 'pwd' function).

I'm not decided on the best syntax to provide an additional layer that does semantic analysis based on what is true on the current system.
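
Purely as one possible shape for such a layer, and not a proposal for the actual File::Spec interface: the semantic step could take a list of volumes known to the current system and prefer the longest one that prefixes the path, leaving everything else to the syntactic routines. In this Python sketch `split_volume` and `known_volumes` are hypothetical, and treating "\" and "/" as equivalent is a simplification.

```python
def split_volume(path, known_volumes=()):
    """Return (volume, rest), preferring the longest known volume prefixing path.

    known_volumes stands in for whatever the local system reports as mounted
    filesystems or exported shares; it is a hypothetical input.
    """
    normalized = path.replace("\\", "/")
    best = ""
    for volume in known_volumes:
        candidate = volume.replace("\\", "/").rstrip("/")
        if normalized == candidate or normalized.startswith(candidate + "/"):
            if len(candidate) > len(best):
                best = candidate
    # An empty volume means nothing matched; the caller would fall back
    # to syntax-only parsing.
    return best, normalized[len(best):]

print(split_volume("//server/home/user/notes.txt", ["//server/home/user"]))
print(split_volume("/home/law/proj", ["/", "/home"]))
```

With the share list supplied, the Samba export parses as the single volume //server/home/user; without it, only the syntactic answer is available.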

Comments?  No flames, please...we don't need to get personal or religious on what should be an engineering discussion (in which I may have misconceptions, but may just be trying to look at the problem space from a different perspective).

thanks,
-linda



--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
