  1. #1
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Altered name based on contents of field and RFC 2396 (URI Generic Syntax) - % escape
    Hi,

    Top-level problem.
    WebMerge alters data used as file names to be suitable for use on a Web server. Characters such as quotes, spaces, question marks, forward slashes, and others have special significance on a server, and such characters are converted to hyphen ("-") when the page is generated.
    This means that the character "%" is also altered (converted to a hyphen, "-").

    According to RFC 1738, the Uniform Resource Locators (URL) specification,
    "%" has a special role because it is used to escape (URL-safe encode) other characters. Sir Timothy J. Berners-Lee (creator of the World Wide Web and director of the World Wide Web Consortium, W3C) wrote in RFC 2396 (URI Generic Syntax):

    each 'original character' is represented as the octet for the US-ASCII code for it, which is, in turn, represented as either the US-ASCII character, or else the "%" escape sequence for that octet
    In my work, the file names used in URLs are stored in a field and are already safe-encoded with "%" escape sequences where needed. So please add an option somewhere to turn the altering off: I am thinking of an "Alter"/"Don't alter" switch in the "Generated File Names" group of the "Detail pages" tab.

    As usual, I am in a hurry and so on…

    Otherwise, what a good product!

    Thanks

  2. #2
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Hi all and WebMerge

    I still have the problem.

    This is crucial for me as I am working on the dark side of the Internet and its dirty tricks. URLs must be exactly what I say. How can we ensure that WebMerge strictly adheres to the contents of the fields used for the URL?

    Example
    A url with
    [path] œuf þý ®©
    is prepared, in my datafeed, as
    %5Bpath%5D%20%C5%93uf%20%C3%BE%C3%BD%20%C2%AE%C2%A9
    but WebMerge turns it into
    -5Bpath-5D-20-C5-93uf-20-C3-BE-C3-BD-20-C2-AE-C2-A9
    This no longer means anything
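
    For illustration, a minimal Python sketch that reproduces the two strings above. It assumes the datafeed value is the UTF-8 percent-encoding of the original text, and it models the reported behaviour with a plain "%" to "-" replacement; that replacement is a stand-in, not WebMerge's actual code.

```python
from urllib.parse import quote, unquote

original = "[path] œuf þý ®©"

# Percent-encode everything except unreserved characters, as UTF-8 octets.
encoded = quote(original, safe="")

# Hypothetical stand-in for the reported behaviour: every "%" becomes "-".
mangled = encoded.replace("%", "-")

print(encoded)
print(mangled)

# The encoded form decodes back to the original; the mangled form does not,
# because no "%" escape sequences are left to decode.
print(unquote(encoded) == original)
print(unquote(mangled) == original)
```

    The last two lines show why the hyphenated name "no longer means anything": decoding has nothing to work with once the "%" characters are gone.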

    Simply add an option like
    URL contains escape codes [yes/no]
    and do not alter the escape codes if the answer is [yes].

    This is something that already exists for the other fields (like the "raw" attribute), but it is needed in the Detail Pages tab.

    Thanks a lot for this great product
    Last edited by Pierre (aka Terdef); March 15th, 2008 at 04:22 AM.

  3. #3
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Hi,

    In a post ( http://forum.abestweb.com/showthread.php?t=46437 ), Richard Gaskin asked about escape sequences:

    If you have a URL to an authoritative discussion on that it would be greatly appreciated
    That was on December 23rd, 2003

    Look at the first post in this thread.

    It would be so good if WM simply respected the Uniform Resource Locators (URL) specification (which is not a UNIX specification but the World-Wide Web specification in RFC 1738).

    Here is the specification (from Tim Berners-Lee himself (creator of the World Wide Web and director of the World Wide Web Consortium W3C ) - World-Wide Web project - CERN)

    http://www.faqs.org/rfcs/rfc1738.html

    It is from December 1994 and is unchanged and active.

    "This document specifies a Uniform Resource Locator (URL), the syntax and semantics of formalized information for location and access of resources via the Internet."

    If, in the "Detail Pages" tab of WebMerge, I say "Use this field for URLs", do not touch what that field contains. I know what I am doing.

    % has a special meaning in a URL. It is a reserved character. Do not change escape sequences: they are the norm, the foundation of the Internet.

    Possibly, offer an option that lets WebMerge alter what is written, but by default keep the contents as they are.

    Your tool is absolutely fantastic. It just suffers from that microscopic problem which, for me, takes on immense proportions.

    Can you do something and when?

    Thanks a lot

  4. #4
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Quote Originally Posted by Pierre (aka Terdef)
    Can you do something and when?
    Yes and 10 minutes ago. Bugs always get a very high priority here.

    While the program had indeed been very conservative with allowed characters in file names, I agree that we need to adhere to specs as closely as possible. If someone includes "%" characters in the data they want to use as file names, they need to know what they're doing or enjoy the unpredictable results.

    I've just posted two new builds for you to see if I've nailed this down, one for Mac and one for Win:

    Windows:
    http://www.fourthworld.com/products/...e243b5.exe.zip

    Mac:
    http://www.fourthworld.com/products/...e243b5.app.zip

    Note that these files contain the application only. To run, just download them, unzip them, then put the unzipped app into your existing WebMerge program folder, and double-click to run.

    Let me know if this continues to give you any trouble with this new build.

    I've also put the fix in the v2.5 code base, so as we get ready to test that version soon we should see the issue gone from there as well.

    Thanks for the report!
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  5. #5
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Yeeeeeeeeesssssssss !

    I am in the process of conducting tests. On the WM side, this seems totally OK.

    It now appears that I have problems on the server side with my website hosting. Over FTP, it is OK in both directions, but over HTTP the server refuses to serve the pages. I am now looking into this.

    Thanks a lot.

    PS: What is the change log between 2.4 and 2.4.3 (and the future 2.5)?

  6. #6
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    Glad the revision worked out.

    Re. changelog: 2.4.2 fixed some minor issues with Mac file names.

    Version 2.5 introduces a much-requested change to the way WebMerge determines when to generate a new index page. Currently, you can have WM make a new index page when the value in a field changes OR when a specified number of records is reached. In v2.5 you can apply both options.

    Other changes in v2.5 include optimizations for Vista and Intel Macs, our first Linux version, and a few lesser enhancements.
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  7. #7
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Hi Richard

    I wonder if there is not a little confusion between URLs and URIs.

    When I have a field with something to use for the URI (the file name), DO NOT TOUCH ANYTHING in this field and use it as it is.

    I've put together a table for WebMerge covering the first 256 characters. It is a large table. It notes the characters forbidden in URIs under Windows, Mac OS, Mac OS X, DOS, Unix, Linux and BSD.
    You can view it at

    Table ascii unicode escape

    Column URI
    Forbidden characters in URIs (i.e. file names), all operating systems combined. I do not think you need to check for these, but if you want to, replace those forbidden characters, and only those, on the fly with a "_" (underscore). All other characters are allowed in URIs, so do not change anything else.
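
    A rough Python sketch of that replacement, for illustration only. The table referred to above is not reproduced in this thread, so the forbidden set used here is an assumption: the characters Windows reserves in file names, plus control characters. The name `safe_file_name` is hypothetical.

```python
# Assumption: Windows' reserved file-name characters stand in for the
# "all operating systems combined" table, which is not shown in the thread.
FORBIDDEN = set('<>:"/\\|?*')


def safe_file_name(name: str) -> str:
    """Replace only forbidden characters with "_"; leave everything else alone."""
    return "".join(
        "_" if ch in FORBIDDEN or ord(ch) < 32 else ch
        for ch in name
    )


print(safe_file_name("a/b?c%20d.html"))
```

    Note that "%" is not in the set, so existing escape sequences pass through untouched.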

    Column URL
    URLs (the links you generate in index pages)
    This is the place to use safe encoding with escape codes.
    Only three characters found in URIs must be safe-encoded in URLs to avoid ambiguity (the others will be safe-encoded automatically by browsers).
    % -> %25 (this substitution made first)
    [Space] -> %20
    [Non breakable space] -> %A0
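
    The reason "%" has to be substituted first can be seen in a small Python sketch. The function names are hypothetical, and "%A0" is taken as written in the list above (a single-octet escape, which assumes a Latin-1 context rather than UTF-8).

```python
def link_for(file_name: str) -> str:
    """Escape the three characters listed above, with "%" handled first."""
    return (
        file_name.replace("%", "%25")
        .replace(" ", "%20")
        .replace("\u00a0", "%A0")
    )


def link_wrong_order(file_name: str) -> str:
    """Same substitutions with "%" handled last, to show the double escaping."""
    return (
        file_name.replace(" ", "%20")
        .replace("\u00a0", "%A0")
        .replace("%", "%25")
    )


print(link_for("50% off.html"))          # 50%25%20off.html
print(link_wrong_order("50% off.html"))  # 50%25%2520off.html
```

    In the wrong order, the "%" introduced by "%20" is itself escaped again, so the link would point to a file whose name literally contains "%20".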

    I have also made a datafeed with
    1 char per line
    the URI (expected to be intact and respected by WM)
    the URL expected in the index

    This datafeed is an Excel datasheet
    http://assiste.com.free.fr/ftp/tests...afe_encode.xls

    Have fun with it.

    Thanks a lot.

  8. #8
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    What about "/"?

    And if WebMerge is used to generate web pages, is there a practical difference between URLs and URIs?
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  9. #9
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Hi

    "/" is forbidden in a "file name" under Windows, Mac OS X and Unix.
    It is permitted but not recommended under BSD.

    The URI stands here for the "file name" under an operating system.
    The URL will be the link to the "file name" under a server operating system.

    @+

  10. #10
    Affiliate Manager
    Join Date
    January 18th, 2005
    Location
    Los Angeles, California
    Posts
    1,913
    On BSD (and other Unix systems), "/" denotes a path delimiter. How can it be used in a file name?

    The detailed info you posted is appreciated (always good to see another GoLive user; sure wish Adobe never acquired that product, it was a nice one when it started out), but what specific changes are you suggesting for WebMerge?
    Richard Gaskin
    Developer of WebMerge: Publish any data feed on any site
    http://www.fourthworld.com

  11. #11
    Newbie
    Join Date
    June 22nd, 2005
    Location
    France
    Posts
    33
    Hi,

    Well,

    Whatever is in the field used for the name of the page, use that content as it stands, without any modification, for the file name.

    However, for the link in the index pages, safe-encode some characters.

    If the field designated by "Name based on content of field" on "Detail pages" contains
    " !$%&'()+,-." (the first char is a space and the last a dot)
    and the suffix is html
    the generated file name must simply be " !$%&'()+,-..html"

    but, in index pages, the link to this page must be
    <a href="%20!$%25&'()+,-..html">blah blah</a>

    Normally, safe encoding (escape codes) is optional and is done by the browsers, but "%" is ambiguous, so it is practically the only character that must be safe-encoded. For good practice, also encode:
    Space -> %20
    [ -> %5B
    ] -> %5D
    ^ -> %5E
    ` -> %60
    { -> %7B
    } -> %7D
    All others from HA0 to HFF
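
    The rule above can be sketched in Python as a single pass over the file name. This is only an illustration under stated assumptions: `href_for` is a hypothetical name, and characters from HA0 to HFF are written as one-octet "%XX" escapes as in this post, which assumes a Latin-1 context rather than UTF-8.

```python
# Characters named in the list above, in addition to "%" itself.
EXTRA = " []^`{}"


def href_for(file_name: str) -> str:
    """Build the link text for a file name; the file name itself is not changed."""
    out = []
    for ch in file_name:
        code = ord(ch)
        if ch == "%" or ch in EXTRA or 0xA0 <= code <= 0xFF:
            out.append(f"%{code:02X}")  # e.g. "%" -> "%25", "[" -> "%5B"
        else:
            out.append(ch)
    return "".join(out)


print(href_for(" !$%&'()+,-..html"))
print(href_for("PQRSTUVWXYZ[]^_.html"))
```

    Because each character is examined exactly once, the "%" produced by an escape is never escaped a second time.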

    For example:
    Hex range - File name - Safe encode link

    H20 to H2F - !$%&'()+,-..html - <a href="%20!$%25&'()+,-..html">blah blah</a>
    H30 to H3F - 0123456789;=.html - <a href="0123456789;=.html">blah blah</a>
    H40 to H4F - @ABCDEFGHIJKLMNO.html - <a href="@ABCDEFGHIJKLMNO.html">blah blah</a>
    H50 to H5F - PQRSTUVWXYZ[]^_.html - <a href="PQRSTUVWXYZ%5B%5D%5E_.html">blah blah</a>
    H60 to H6F - `abcdefghijklmno.html - <a href="%60abcdefghijklmno.html">blah blah</a>
    H70 to H7F - pqrstuvwxyz{}~.html - <a href="pqrstuvwxyz%7B%7D~.html">blah blah</a>
    HA0 to HAF - ¡¢£¤¥§¨©ª«¯®¯.html - <a href="%A1%A2%A3%A4%A5%A7%A8%A9%AA%AB%AC%AF%AE%AF.html">blah blah</a>

    Also, see http://assiste.com.free.fr/p/faq_webmaster/!_faq_webmasters.html

    That's all

    PS:
    GoLive is the best!
    Right: on BSD, "/" denotes a path delimiter and cannot be used in a file name.

    @+
