X-Cart: shopping cart software

HTML Catalog Cleaner - Removes excess white space (https://forum.x-cart.com/showthread.php?t=8260)

NuAlpha 06-26-2004 04:49 PM

HTML Catalog Cleaner - Removes excess white space
 
This script acts like a one-time {strip} tag, removing all unnecessary white space from the .html files generated by the HTML Catalog and so reducing the bandwidth those files need.

At the moment, I don't have a .tpl interface written for it, so you have to call it from your browser. This works on your existing catalog, though I am sure you could pull out the regex code and stick it into admin/html_catalog.php if you're so inclined.

Here is the code. Put it in whatever directory you wish under whatever name you wish (.php extension of course)...
Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##            ## HTML Catalog Cleaner ##            ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.0.0 (6/26/2004)
## Last updated: 1.0.4 6/26/2004

# Function to Clean-up #
########################
function script_shutdown() {
        ini_set('max_execution_time', '30'); // Reset the maximum execution time.
        ini_set('max_input_time', '60'); // Reset the maximum input time.
        ob_implicit_flush(0); // Data should be kept in the buffer until ready.
}

register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution/input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ini_set('zlib.output_compression_level', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
ob_implicit_flush(1); // Show the progress in the browser.

echo "Stripping the HTML files of excess spaces...
";

# Initialize variables.
$dir = '/home/your-site-dir/path-to-xcart/catalog'; // Set the absolute directory path.
$successes = 0;
$failures = 0;
$filesize['init'] = 0;
$filesize['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0; // Progress bar length reducer

# Initialize settings.
$regex = array(''=>'/[\t\n\r\f]+/', // Newlines and tabs
                          ' '=>'/ +/', // Excess spaces
                          '&nbsp;'=>'/&nbsp; /i', // Additional space after non-breaking space
                          '><'=>'/> </' // Space between HTML tags
                          );

# Open the directory and store the file list.
if (is_dir($dir)) {
        if ($dh = opendir($dir)) {
                # Iterate over file list.
                while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
                        if (strpos($filename,'.htm') !== false)
                                $file_list[] = $filename;
                }
                closedir($dh); // Close the directory.
        }
       
        # Perform specific operations on the files.
        foreach ($file_list as $file) {
                if (($fs_tmp = filesize($dir.DIRECTORY_SEPARATOR.$file)) !== false)
                        $filesize['init'] += $fs_tmp;
                $file_contents = file_get_contents($dir.DIRECTORY_SEPARATOR.$file);
                foreach ($regex as $replace=>$finds) // Do each replacement.
                        $file_contents = preg_replace($finds,$replace,$file_contents);
                $fp = fopen($dir.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
                if (!fwrite($fp,$file_contents)) {
                        $failure_list[] = $file; // Log failures.
                        $failures++;
                } else {
                        $successes++;
                }
                fclose($fp);
                if (($fs_tmp = filesize($dir.DIRECTORY_SEPARATOR.$file)) !== false)
                        $filesize['final'] += $fs_tmp;
                if ($pblr == 5) { // Modify this number to reduce the length of progress bar in case of a lot of files.
                        echo '|'; // Show progress line.
                        $cnt['tmp']++; // Increment the newline counter.
                        $pblr = 0; // Reset pblr counter.
                } else {
                        $pblr++;
                }
                $cnt['tot']++; // Increment totals counter.
                if ($cnt['tmp']==300) { echo '
'; $cnt['tmp']=0; } // Reset the counter.
                }
} else
        die(''.$dir.' is not a directory!');

echo '

There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '

Your HTML Catalog files initially occupied '.number_format($filesize['init']).' bytes of disk space.';
echo '
They now occupy '.number_format($filesize['final']).' bytes of disk space.</p>';

if (isset($failure_list)) {
        echo '
The following files could not be written:
';
        foreach ($failure_list as $fail)
                echo ''.$fail.'
';
}
?>
</body>
</html>


Let me know if you find any problems! :)
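
For anyone who wants to see what the replacement rules do before running them against a whole catalog, here is a minimal sketch that applies the same kind of rules to a single string. The function name strip_excess_whitespace() and the sample markup are made up for illustration, and the non-breaking space rule is written the way its comment describes it. It is not a drop-in patch for admin/html_catalog.php.

```php
<?php
// Hypothetical helper (not part of X-Cart): applies the cleaner's whitespace rules to one string.
function strip_excess_whitespace($html) {
    // Ordered list of (pattern, replacement) pairs; the order matters.
    $rules = array(
        array('/[\t\n\r\f]+/', ''),     // Newlines and tabs
        array('/ +/', ' '),             // Runs of spaces collapse to a single space
        array('/&nbsp; /i', '&nbsp;'),  // Additional space after a non-breaking space
        array('/> </', '><'),           // Space between HTML tags
    );
    foreach ($rules as $rule) {
        $html = preg_replace($rule[0], $rule[1], $html);
    }
    return $html;
}

// Made-up sample standing in for a fragment of a catalog page.
$sample = "<table>\n\t<tr>\n\t\t<td>Price:&nbsp;   10</td> <td>In   stock</td>\n\t</tr>\n</table>\n";
echo strip_excess_whitespace($sample), "\n";
```

Note that the first rule deletes newlines outright, so text that relies on a line break to separate two words would be joined together. Worth checking on a copy of the catalog first.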

adpboss 06-26-2004 10:28 PM

Code:

# Initialize variables.
$dir = '/home/your-site-dir/path-to-xcart/catalog'; // Set the absolute

Don't forget to set the path to your html catalog!

I'm testing now! :-)

adpboss 06-26-2004 10:34 PM

The code is cleaned up although the reported initial and finished file size for the work is the same.

Pages are lightning quick.

This is a SICK SICK mod. I love it.

I just added it as a link under my HTML Catalog link in the menu admin.

skin1/admin/menu_admin.tpl

NuAlpha 06-26-2004 10:44 PM

Quote:

Originally Posted by adpboss
The code is cleaned up although the reported initial and finished file size for the work is the same.


I had already done the once over with this mod on our catalog, when at the last second I decided to add the 'filesize difference' code before posting so I never even saw the output. :lol:

I am surprised that there was no difference in filesize. :? I wonder why not....hmmm. :?:
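
One likely explanation, offered as a guess since it was not tested against the script itself: PHP caches the result of filesize(), so a second call on the same path can return the size seen before the rewrite unless clearstatcache() is called in between. A minimal sketch with a throwaway temp file (the file and its contents are placeholders, not catalog data):

```php
<?php
// Throwaway file standing in for one catalog page (placeholder, created in the temp dir).
$path = tempnam(sys_get_temp_dir(), 'cat');
file_put_contents($path, "<p>  lots   of   space  </p>\n\n\n");

$before = filesize($path);   // This result may now be held in PHP's stat cache.

// Rewrite the file with shorter contents, as the cleaner does.
$cleaned = '<p> lots of space </p>';
file_put_contents($path, $cleaned);

clearstatcache();            // Drop cached stat data before measuring again.
$after = filesize($path);

echo "before: $before bytes, after: $after bytes\n";
unlink($path);               // Remove the throwaway file.
```

The later versions in this thread sidestep the question by adding up strlen() of the file contents instead of calling filesize().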

NuAlpha 06-26-2004 10:49 PM

By the way adpboss, did you make use of those regular expressions I posted for adding {literal} tags to every space and newline in your plain text emails? I have not gotten a chance to actually test it, but I will be doing so soon as I need plain text labels to print.

Am thinking about writing a customizable mod to allow printing on sheets of Avery style labels.

adpboss 06-26-2004 11:19 PM

Found a bug, it kills this javascript that I have in my product pages that pops up a new window for detailed images.

So I can't use it as-is, but hopefully I'll find time to play with this and get it to work. :(

Is there a way to PREVENT areas of code from being touched?

Code:

{literal}
<SCRIPT language=JavaScript1.2>
<!--
var store_language='{/literal}{$store_language}{literal}';

function product_option(name_of_option)
{
{/literal}
for(i=0; i<{$product_options_count|default:"0"}; i++)
  if (document.orderform[i].name.search(name_of_option) != -1)
        return document.orderform[i];
return -1;
{literal}
}

function FormValidation()

{/literal}
{if $javascript_code}
{$javascript_code}
{else}
return true;
{/if}
{literal}
}
-->
</SCRIPT>


NuAlpha 06-27-2004 07:16 AM

Hmmm...I will have to think about this one. Forgot about the javascript. :(

NuAlpha 06-27-2004 09:58 AM

Here is the revised version. This should take care of the javascript problem. :)
Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##            ## HTML Catalog Cleaner ##            ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.1.0 (6/26/2004)
## Last updated: 6/27/2004

# Define the Constants #
########################
define('CATALOG_DIR', '/home/nulimec/public_html/catalog'); // Set the absolute directory path to your catalog.
define('BAR_LENGTH_REDUCER', 3); // If you have over 1000 HTML files in your catalog, you may wish to set this number higher.
# System constants.
define('MAX_ET', ini_get('max_execution_time'));
define('MAX_IT', ini_get('max_input_time'));

# Initialize variables.
$successes = 0;
$failures = 0;
$filelength['init'] = 0;
$filelength['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0; // Progress bar length reducer variable

# Initialize regular expressions.
$regex = array(''=>'/[\t\n\r\f]+/', // Newlines and tabs
                          ' '=>'/ +/', // Excess spaces
                          '&nbsp;'=>'/&nbsp; /i', // Additional space after non-breaking space
                          '><'=>'/> </' // Space between HTML tags
                          );
$java_saver = '/(<SCRIPT[^>]*>[[:space:]]*|.+?<\/SCRIPT>)/i';

# Function to Clean-up #
########################
function script_shutdown() {
        ini_set('max_execution_time', MAX_ET); // Reset the maximum execution time.
        ini_set('max_input_time', MAX_IT); // Reset the maximum input time.
        ob_implicit_flush(0); // Data should be kept in the buffer until ready.
}

register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution/input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ini_set('zlib.output_compression_level', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
ob_implicit_flush(1); // Show the progress in the browser.

# Pad with 256 bytes for Internet Explorer to show output immediately.
for ($pad=0; $pad <= 8*256; $pad++) echo "\t";

echo "Stripping the HTML files of excess spaces...
";

# Open the directory and store the file list.
if (is_dir(CATALOG_DIR)) {
        if ($dh = opendir(CATALOG_DIR)) {
                # Iterate over file list.
                while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
                        if (strpos($filename,'.htm') !== false)
                                $file_list[] = $filename;
                }
                closedir($dh); // Close the directory.
        }
       
        # Perform specific operations on the files.
        foreach ($file_list as $file) {
                $file_contents = file_get_contents(CATALOG_DIR.DIRECTORY_SEPARATOR.$file);
                $filelength['init'] += strlen($file_contents);
                # Examine document for javascript code blocks and preserve them for restoration.
                if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
                        foreach ($got_java as $java_chip) {
                                if (is_array($java_chip)) { // Favorite ice cream! :)
                                        $java_scripts[] = $java_chip[1];
                                }
                        }
                        foreach ($regex as $replace=>$finds) // Do each replacement.
                                $file_contents = preg_replace($finds,$replace,$file_contents);
                        # Reverse the damage to the javascripts.
                        foreach ($java_scripts as $jscript) {
                                foreach ($regex as $replace=>$finds) // Determine what the stripped javascript block looks like.
                                        $stripped_java = preg_replace($finds,$replace,$jscript);
                                # Find the stripped java and replace it with the original code.
                                $file_contents = str_replace($stripped_java,$jscript,$file_contents);
                        }
                } else {
                        foreach ($regex as $replace=>$finds) // Do each replacement.
                                $file_contents = preg_replace($finds,$replace,$file_contents);
                }
                $fp = fopen(CATALOG_DIR.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
                if (!fwrite($fp,$file_contents)) {
                        $failure_list[] = $file; // Log failures.
                        $failures++;
                } else {
                        $successes++;
                }
                fclose($fp);
                $filelength['final'] += strlen($file_contents);
                if ($pblr == BAR_LENGTH_REDUCER) { // Progress bar length reducer.
                        echo '|'; // Lengthen the progress bar.
                        $cnt['tmp']++; // Increment the newline counter.
                        $pblr = 0; // Reset pblr counter.
                } else {
                        $pblr++;
                }
                $cnt['tot']++; // Increment totals counter.
                if ($cnt['tmp']==300) { echo '
'; $cnt['tmp']=0; } // Reset the counter.
                }
} else
        die(''.CATALOG_DIR.' is not a directory! Please check the path and try again.');

echo '

There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '

Your HTML Catalog files had a total combined length of '.number_format($filelength['init']).' characters.';
echo '
They now have a total length of '.number_format($filelength['final']).' characters.</p>';
echo 'That is a total of <u>'.number_format($filelength['init']-$filelength['final']).'</u> excess white spaces removed from your files.

';

if (isset($failure_list)) {
        echo '
The following files could not be written to:
';
        $c = 'Y'; // Init background color notifier.
        foreach ($failure_list as $fail)
                # Show background color every other line for readability.
                if($c=='N') {$bgb=''; $bge=''; $c='Y';} else {$bgb='<font style="background-color:#E0E0E0">'; $bge='</font>'; $c='N';}
                echo $bgb.''.$fail.$bge.'
';
}
?>
</body>
</html>


adpboss 06-28-2004 12:29 AM

Script still kills Javascript and hangs.

I ran it four times tonight, and every time it ran for more than 15 minutes before I eventually had to stop it.

:(

NuAlpha 06-28-2004 09:39 AM

I am going to be redoing our HTML catalog soon. I will have to work on it then. :?

Not sure what is wrong though.

NuAlpha 06-28-2004 01:25 PM

All fixed! Tested it and it preserves the javascript while stripping the HTML. Doesn't hang anymore on my setup...which was the result of a stupid coding mistake. :P

Code:

<?php ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems. ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##            ## HTML Catalog Cleaner ##            ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.1.2 (6/26/2004)
## Last updated: 6/28/2004

# Define the Constants #
########################
define('CATALOG_DIR', '/home/your-site-dir/path-to-xcart/catalog'); // Set the absolute directory path to your catalog.
define('BAR_LENGTH_REDUCER', 3); // If you have over 1000 HTML files in your catalog, you may wish to set this number higher.
# System constants.
define('MAX_ET', ini_get('max_execution_time'));
define('MAX_IT', ini_get('max_input_time'));

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution & input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ob_implicit_flush(1); // Show the progress in the browser.

# Initialize variables.
$successes = 0;
$failures = 0;
$filelength['init'] = 0;
$filelength['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0; // Progress bar length reducer variable

# Initialize regular expressions.
$regex = array(''=>'/[\t\n\r\f]+/', // Newlines and tabs
                          ' '=>'/ +/', // Excess spaces
                          '&nbsp;'=>'/&nbsp; /i', // Additional space after non-breaking space
                          '><'=>'/> </' // Space between HTML tags
                          );
$java_saver = '/(<script[^>]*>.*?<\/script>)/si';

# Function to Clean-up #
########################
function script_shutdown() {
        ini_set('max_execution_time', MAX_ET); // Reset the maximum execution time.
        ini_set('max_input_time', MAX_IT); // Reset the maximum input time.
        ob_implicit_flush(0); // Data should be kept in the buffer until ready.
}

register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Pad with 256 bytes for Internet Explorer to show output immediately.
for ($pad=0; $pad < 256; $pad++) echo "\t";

echo "Stripping the HTML files of excess spaces...
";

# Open the directory and store the file list.
if (is_dir(CATALOG_DIR)) {
        if ($dh = opendir(CATALOG_DIR)) {
                # Iterate over file list.
                while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
                        if (strpos($filename,'.htm') !== false)
                                $file_list[] = $filename;
                }
                closedir($dh); // Close the directory.
        }
       
        # Perform specific operations on the files.
        foreach ($file_list as $file) {
                $file_contents = file_get_contents(CATALOG_DIR.DIRECTORY_SEPARATOR.$file);
                $filelength['init'] += strlen($file_contents);
                # Examine document for javascript code blocks and preserve them for restoration.
                if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
                        foreach ($got_java as $java_chip) {
                                if (is_array($java_chip)) {
                                        $java_scripts[] = $java_chip[1];
                                }
                        }
                        foreach ($regex as $replace=>$finds) // Do each replacement.
                                $file_contents = preg_replace($finds,$replace,$file_contents);
                        # Reverse the damage to the javascripts.
                        if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
                                foreach ($got_java as $stripped_java) {
                                        if (is_array($stripped_java)) {
                                                # Find the stripped java and replace it with the original code.
                                                $file_contents = str_replace($stripped_java[1],current($java_scripts),$file_contents);
                                                next($java_scripts);
                                        }
                                }
                        }
                } else {
                        foreach ($regex as $replace=>$finds) // Do each replacement.
                                $file_contents = preg_replace($finds,$replace,$file_contents);
                }
                $fp = fopen(CATALOG_DIR.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
                if (!fwrite($fp,$file_contents)) {
                        $failure_list[] = $file; // Log failures.
                        $failures++;
                } else {
                        $successes++;
                }
                fclose($fp);
                unset($java_scripts);
                $java_scripts = array();
                $filelength['final'] += strlen($file_contents);
                if ($pblr == BAR_LENGTH_REDUCER) { // Progress bar length reducer.
                        echo '|'; // Lengthen the progress bar.
                        $cnt['tmp']++; // Increment the newline counter.
                        $pblr = 0; // Reset pblr counter.
                } else {
                        $pblr++;
                }
                $cnt['tot']++; // Increment totals counter.
                if ($cnt['tmp']==300) { echo '
'; $cnt['tmp']=0; } // Reset the counter.
                }
} else
        die(''.CATALOG_DIR.' is not a directory! Please check the path and try again.');

echo '

There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '

Your HTML Catalog files had a total combined length of '.number_format($filelength['init']).' characters.';
echo '
They now have a total length of '.number_format($filelength['final']).' characters.</p>';
echo 'That is a total of <u>'.number_format($filelength['init']-$filelength['final']).'</u> excess white spaces removed from your files.

';

if (isset($failure_list)) {
        echo '
The following files could not be written to:
';
        $c = 'Y'; // Init background color notifier.
        foreach ($failure_list as $fail) {
                # Show background color every other line for readability.
                if($c=='N') {$bgb=''; $bge=''; $c='Y';} else {$bgb='<font style="background-color:#E0E0E0">'; $bge='</font>'; $c='N';}
                echo $bgb.''.$fail.$bge.'
';
        }
}
?>
</body>
</html>


Let me know if you notice anything else that needs fixing. Enjoy!
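
For comparison, here is a different way to keep script blocks untouched. It is only a sketch and not what the script above does: instead of stripping everything and restoring the scripts afterwards, it splits the page on script blocks with preg_split() and cleans only the pieces in between. The function name clean_outside_scripts() and the sample page are made up for illustration.

```php
<?php
// Hypothetical alternative (not the posted method): clean only the markup outside <script> blocks.
function clean_outside_scripts($html) {
    // With PREG_SPLIT_DELIM_CAPTURE the captured script blocks are returned as well,
    // so odd-numbered pieces are script blocks and even-numbered pieces are plain markup.
    $pieces = preg_split('/(<script[^>]*>.*?<\/script>)/si', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($pieces as $i => $piece) {
        if ($i % 2 == 1) {
            continue; // Script block: leave it exactly as found.
        }
        $piece = preg_replace('/[\t\n\r\f]+/', '', $piece);      // Newlines and tabs
        $piece = preg_replace('/ +/', ' ', $piece);              // Excess spaces
        $piece = preg_replace('/&nbsp; /i', '&nbsp;', $piece);   // Space after non-breaking space
        $piece = preg_replace('/> </', '><', $piece);            // Space between HTML tags
        $pieces[$i] = $piece;
    }
    return implode('', $pieces);
}

// Made-up page: markup, one script block with a line comment, more markup.
$page = "<div>\n\t<b>Hi</b>   <i>there</i>\n</div>\n"
      . "<script type=\"text/javascript\">\n// popup\nvar w = 1;\n</script>\n"
      . "<p>\n\tDone\n</p>\n";
echo clean_outside_scripts($page);
```

The sample script contains a // comment on purpose: once the newline after it is removed, the comment swallows the rest of the script, which is one way stripping can break javascript.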

adpboss 06-28-2004 03:13 PM

I'll test soon.

Thanks NuAlpha, it's a great mod if we get it working right. :)

NuAlpha 06-28-2004 05:46 PM

Minor update:

Replace the code:
Code:

# Pad with 256 bytes for Internet Explorer to show output immediately.
for ($pad=0; $pad < 256; $pad++) echo "\t";


...with:
Code:

# Pad with 256 bytes for Internet Explorer to show output immediately.
if (isset($_SERVER['HTTP_USER_AGENT']) && strpos($_SERVER['HTTP_USER_AGENT'],'MSIE') !== false) {
        for ($pad=0; $pad < 256; $pad++) echo "\t";
        echo "\n";
}


:wink:

NuAlpha 06-29-2004 11:05 AM

Potential bug that needs fixing...

Replace:
Code:

<?php ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems. ?>


With:
Code:

<?php
        ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
?>


That terminating PHP tag on the same line as the comment can cause problems.

NuAlpha 06-30-2004 11:20 AM

Just ran this on our latest HTML catalog. Stripped a total of 14,606,179 excess white spaces from all of the catalog files. Javascript was left untouched and everything seems to work great! 8)

adpboss 06-30-2004 07:22 PM

Works with my java pop stuff, it's relatively fast and the script terminates properly with the report at the end.

This includes all of the bug fixes and updates up to the time of this post.

Using version 3.4.14.

GREAT JOB NUALPHA!

Code:

<?php
  ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##            ## HTML Catalog Cleaner ##            ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.1.2 (6/26/2004)
## Last updated: 6/28/2004

# Define the Constants #
########################
define('CATALOG_DIR', '/home/your-site-dir/path-to-xcart/catalog'); // Set the absolute directory path to your catalog.
define('BAR_LENGTH_REDUCER', 3); // If you have over 1000 HTML files in your catalog, you may wish to set this number higher.
# System constants.
define('MAX_ET', ini_get('max_execution_time'));
define('MAX_IT', ini_get('max_input_time'));

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution & input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ob_implicit_flush(1); // Show the progress in the browser.

# Initialize variables.
$successes = 0;
$failures = 0;
$filelength['init'] = 0;
$filelength['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0; // Progress bar length reducer variable

# Initialize regular expressions.
$regex = array(''=>'/[\t\n\r\f]+/', // Newlines and tabs
            ' '=>'/ +/', // Excess spaces
            '&nbsp;'=>'/&nbsp; /i', // Additional space after non-breaking space
            '><'=>'/> </' // Space between HTML tags
            );
$java_saver = '/(<script[^>]*>.*?<\/script>)/si';

# Function to Clean-up #
########################
function script_shutdown() {
  ini_set('max_execution_time', MAX_ET); // Reset the maximum execution time.
  ini_set('max_input_time', MAX_IT); // Reset the maximum input time.
  ob_implicit_flush(0); // Data should be kept in the buffer until ready.
}

register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Pad with 256 bytes for Internet Explorer to show output immediately.
if (isset($_SERVER['HTTP_USER_AGENT']) && strpos($_SERVER['HTTP_USER_AGENT'],'MSIE') !== false) {
  for ($pad=0; $pad < 256; $pad++) echo "\t";
  echo "\n";
}

echo "Stripping the HTML files of excess spaces...
";

# Open the directory and store the file list.
if (is_dir(CATALOG_DIR)) {
  if ($dh = opendir(CATALOG_DIR)) {
      # Iterate over file list.
      while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
        if (strpos($filename,'.htm') !== false)
            $file_list[] = $filename;
      }
      closedir($dh); // Close the directory.
  }

  # Perform specific operations on the files.
  foreach ($file_list as $file) {
      $file_contents = file_get_contents(CATALOG_DIR.DIRECTORY_SEPARATOR.$file);
      $filelength['init'] += strlen($file_contents);
      # Examine document for javascript code blocks and preserve them for restoration.
      if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
        foreach ($got_java as $java_chip) {
            if (is_array($java_chip)) {
              $java_scripts[] = $java_chip[1];
            }
        }
        foreach ($regex as $replace=>$finds) // Do each replacement.
            $file_contents = preg_replace($finds,$replace,$file_contents);
        # Reverse the damage to the javascripts.
        if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
            foreach ($got_java as $stripped_java) {
              if (is_array($stripped_java)) {
                  # Find the stripped java and replace it with the original code.
                  $file_contents = str_replace($stripped_java[1],current($java_scripts),$file_contents);
                  next($java_scripts);
              }
            }
        }
      } else {
        foreach ($regex as $replace=>$finds) // Do each replacement.
            $file_contents = preg_replace($finds,$replace,$file_contents);
      }
      $fp = fopen(CATALOG_DIR.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
      if (!fwrite($fp,$file_contents)) {
        $failure_list[] = $file; // Log failures.
        $failures++;
      } else {
        $successes++;
      }
      fclose($fp);
      unset($java_scripts);
      $java_scripts = array();
      $filelength['final'] += strlen($file_contents);
      if ($pblr == BAR_LENGTH_REDUCER) { // Progress bar length reducer.
        echo '|'; // Lengthen the progress bar.
        $cnt['tmp']++; // Increment the newline counter.
        $pblr = 0; // Reset pblr counter.
      } else {
        $pblr++;
      }
      $cnt['tot']++; // Increment totals counter.
      if ($cnt['tmp']==300) { echo '
'; $cnt['tmp']=0; } // Reset the counter.
      }
} else
  die(''.CATALOG_DIR.' is not a directory! Please check the path and try again.');

echo '

There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '

Your HTML Catalog files had a total combined length of '.number_format($filelength['init']).' characters.';
echo '
They now have a total length of '.number_format($filelength['final']).' characters.</p>';
echo 'That is a total of <u>'.number_format($filelength['init']-$filelength['final']).'</u> excess white spaces removed from your files.

';

if (isset($failure_list)) {
  echo '
The following files could not be written to:
';
  $c = 'Y'; // Init background color notifier.
  foreach ($failure_list as $fail) {
      # Show background color every other line for readability.
      if($c=='N') {$bgb=''; $bge=''; $c='Y';} else {$bgb='<font style="background-color:#E0E0E0">'; $bge='</font>'; $c='N';}
      echo $bgb.''.$fail.$bge.'
';
  }
}
?>
</body>
</html>


jburba2000 08-12-2004 11:14 AM

Anyone wanna tell me why I am getting this error? The only thing I have changed is my directory path???

Quote:

Warning: Unexpected character in input: '\' (ASCII=92) state=1 in /home/admin/sitedir/admin/catalog_cleanup.php on line 2

Parse error: parse error in /home/admin/sitedir/admin/catalog_cleanup.php on line 2

jburba2000 08-12-2004 11:44 AM

never mind, i figured it out, and with a total of 1,111,686 excess white spaces removed from my code.

wow, kudos bro, much appreciated mod, hope i can pay ya back later...

NuAlpha 08-12-2004 12:34 PM

Quote:

Originally Posted by jburba2000
never mind, i figured it out, and with a total of 1,111,686 excess white spaces removed from my code.

wow, kudos bro, much appreciated mod, hope i can pay ya back later...


Welcome! :wink:

john99 08-21-2004 08:12 PM

Quote:

Originally Posted by NuAlpha
Quote:

Originally Posted by jburba2000
never mind, i figured it out, and with a total of 1,111,686 excess white spaces removed from my code.

wow, kudos bro, much appreciated mod, hope i can pay ya back later...


Welcome! :wink:


Hi jburba2000, I hit the same problem and wonder if you could let me know how you fixed it. Thanks.

wallachee 10-10-2004 09:40 PM

Same problem here in 3.5.10...parse error line 2.....If you take out the lines:

<?php
  ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
?>


It gives a parse error on line 44....any ideas?

B00MER 10-12-2004 12:25 PM

Smarty has this built-in ;)
Code:

$smarty->load_filter('output','trimwhitespace');
Details:
:arrow: http://smarty.incutio.com/?page=SmartyTips

NuAlpha 10-12-2004 12:34 PM

Quote:

Originally Posted by B00MER
Smarty has this built-in ;)
Code:

$smarty->load_filter('output','trimwhitespace');
Details:
:arrow: http://smarty.incutio.com/?page=SmartyTips


Yep, but it doesn't clear out the white space in the HTML catalog. It also won't remove all excess white space, just multiples of the same kind. All of our templates are enclosed in {strip} tags, and our pages download as one long line of code. Remove the tags and turn on that output filter, and the code remains multi-line.

As for the problem people are having here with the code I posted, I haven't a clue what could be wrong as I verified the code in Zend Studio and everything checks out fine. The one I have is the same version and runs just fine. :-k

wallachee 10-12-2004 03:20 PM

I figured it out. The problem is that when you copy the file there is a lot of white space before some lines. Take out all of that space, and the code works great. Awesome mod. It stripped 391,200 white spaces from my catalog :-)

-Bradley
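
For anyone hitting the same parse error after copying the script from the forum page, the fix described above (removing the white space in front of each line) can be automated. This is only a sketch: `strip_leading_whitespace()` is a hypothetical helper, not part of the original mod, and the guess that the pasted white space includes non-breaking spaces is an assumption.

```php
<?php
// Hypothetical helper, not part of the original mod.
// Removes leading spaces, tabs, and non-breaking spaces from every line.
// Assumption: the pasted file may contain non-breaking spaces, either as
// UTF-8 (bytes 0xC2 0xA0) or as Latin-1 (byte 0xA0), which PHP cannot parse.
function strip_leading_whitespace($source) {
    // Split on any common line ending.
    $lines = preg_split('/\r\n|\r|\n/', $source);
    foreach ($lines as $i => $line) {
        $lines[$i] = preg_replace('/^(?:[ \t]|\xC2\xA0|\xA0)+/', '', $line);
    }
    return implode("\n", $lines);
}

// Example: two UTF-8 non-breaking spaces and two tabs in front of code.
$pasted = "\xC2\xA0\xC2\xA0ini_set('zlib.output_compression', 'Off');\n\t\techo 'done';";
echo strip_leading_whitespace($pasted), "\n";
```

Note that this also removes indentation inside multi-line strings, which in this script only changes the white space of the HTML it prints.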

wallachee 10-12-2004 04:14 PM

Will this mod have any effect on Google spidering (positive or negative)?

-Bradley

NuAlpha 10-12-2004 04:27 PM

Quote:

Originally Posted by wallachee
Will this mod have any effect on Google spidering (positive or negative)?


Positive in that the pages will download somewhat faster. Other than that I do not know. :)

alpine 11-11-2004 05:31 AM

Compatible with 4.0.6?
 
Will this work with 4.0.6?

GM 11-12-2004 03:31 PM

I'm gonna' try it on 4.06 (nice job!) 8O

Metal-X-Man 11-24-2004 06:08 PM

X-Cart Version 3.5.4

Mod is very sick! Worked like a clock!

Here's the data:
-----------------------------------------------------
There were 82,069 successful cleanings and 1 failures out of a total of 82,070 files.

Your HTML Catalog files had a total combined length of 2,500,312,154 characters.
They now have a total length of 2,009,804,777 characters.
That is a total of 490,507,377 excess white spaces removed from your files.

-----------------------------------------------------

Never mind the failure - it was garbage in the database. Anyone had more whitespaces removed than I have?

Metal-X

GM 12-02-2004 11:15 PM

Works like a charm! Nice one! :D

adpboss 12-05-2004 11:13 AM

GM,

Could you post your 4.0 code for us please?

markwhoo 12-31-2004 02:19 PM

Quote:

Originally Posted by GM
I'm gonna' try it on 4.06 (nice job!) 8O


So tell me, I see you're on v4.0.9 now. What did you do to the code to make it work for this version?

Will ADP's last post with all of the corrections made do the trick, or is there more to it than that?

Thanks in advance for the help. :wink:

GM 01-01-2005 09:50 AM

I never changed the original code (NuAlpha revised version). X-Cart v4.09 and my site is smokin'! I'm on a shared server too. But I also applied B00MER's Smarty mod for the flipover.
This is the mod of the year man! 8O

Well.... that's not all... I trimmed out the tags in home.tpl, trashed fancy cats, optimized my graphics, souped up the head.tpl...etc. (Nice Holley four barrel pumper...)

NuAlpha 01-01-2005 10:44 AM

And to think, shortly after writing this mod we decided using the HTML catalog was not feasible because of the number of product pages we have. Had to write some complex PHP and MySQL code distributed throughout X-Cart to tie in precisely with mod_rewrite code. It all worked so well (after many migraine-related bug fixes) that our HTML catalog has since been deleted. :lol:

The only time this code became a server drain was when the Pompos/dir.com search engine bot brutalized our site, sucking more pages per minute than MSN, Google, Google-Media, Froogle, Yahoo, Teoma, etc. combined. They have since been banned.

markwhoo 01-01-2005 06:55 PM

Quote:

Originally Posted by GM
I never changed the original code (NuAlpha revised version). X-Cart v4.09 and my site is smokin'! I'm on a shared server too. But I also applied B00MER's Smarty mod for the flipover.
This is the mod of the year man! 8O

Well.... that's not all... I trimmed out the tags in home.tpl, trashed fancy cats, optimized my graphics, souped up the head.tpl...etc. (Nice Holley four barrel pumper...)


In my cart, I have a few scripts that did not work with the literal tags, so I removed them and they worked. I noticed ADP had a complaint about the script issues up front and then there was a bit of tweaking going on, but if I remember, in the thread I thought I saw a mention of the use of literal tags to keep it working.

With my inability to add the literal tags on the few scripts that didn't work until they were removed, will this Mod work for me without damage to the scripts?

Thanks for the input, I appreciate it.

Stephen Hatton 01-01-2005 11:29 PM

Path Issues in php script
 
Hi NuAlpha

Thanks for the post - it looks good. I'm having problems with setting the absolute path in the Defining Constants.

My site has the following structure to the catalog:
http://www.eotr.com/Light4Life/catalog

I have created the .php file and put it into the directory, but it keeps complaining that the path "is not a directory! Please check the path and try again." for every path format I have tried so far.

Do we need to set special permissions for the php script and the html files?

I have 1200 files with average size of 200Kbytes each. It would be interesting to see what it does to them :D

Regards
Ing. Stephen Hatton
:idea:

NuAlpha 01-02-2005 05:10 PM

Re: Path Issues in php script
 
Quote:

Originally Posted by Stephen Hatton
My site has the following structure to the catalog:
http://www.eotr.com/Light4Life/catalog


You have to enter the absolute file path from root. For instance, most setups would look like-> "/home/user-dir-name/public_html/Light4Life/catalog"

Don't include your domain name or http:// with that directory path. Your absolute file path should start with forward slash, which is root-> "/"
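
A quick way to test a candidate path before running the cleaner is sketched below. `check_catalog_dir()` is a hypothetical helper written for illustration; it is not in the posted script, and the posted script's own constant name is not shown here. It rejects URLs and relative paths, then asks PHP whether the directory exists.

```php
<?php
// Hypothetical helper for illustration; not part of the posted script.
// Returns an empty string when $path is usable, otherwise an error message.
function check_catalog_dir($path) {
    // A scheme such as http:// means a URL was entered, not a file path.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $path)) {
        return 'This looks like a URL, not a file path: ' . $path;
    }
    // An absolute path on a Unix-style server starts at root, "/".
    if (substr($path, 0, 1) !== '/') {
        return 'Path must be absolute (start with "/"): ' . $path;
    }
    if (!is_dir($path)) {
        return $path . ' is not a directory! Please check the path and try again.';
    }
    return '';
}

// Prints the absolute path of the directory this file sits in,
// which helps when the server's directory layout is unknown.
echo dirname(__FILE__), "\n";
```

Dropping this file into the catalog directory and loading it in a browser prints that directory's absolute path, which can then be passed to `check_catalog_dir()`.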

Quote:

Originally Posted by Stephen Hatton
Do we need to set special permissions for the php script and the html files?


Permissions for the catalog directory should be chmod 777.
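
To see ahead of time which files the script would fail to write, a check along these lines may help. It is a sketch under assumptions: `list_unwritable_files()` is a hypothetical name, and it scans a single directory level only, since whether the catalog uses subdirectories is not stated in this thread.

```php
<?php
// Hypothetical pre-flight check; not part of the posted script.
// Returns the .html files in $dir that PHP cannot write to.
// Assumption: only the top level of $dir is scanned.
function list_unwritable_files($dir) {
    $unwritable = array();
    if (!is_dir($dir)) {
        return $unwritable;
    }
    $handle = opendir($dir);
    if ($handle === false) {
        return $unwritable;
    }
    while (($entry = readdir($handle)) !== false) {
        $file = $dir . '/' . $entry;
        if (is_file($file) && substr($entry, -5) === '.html' && !is_writable($file)) {
            $unwritable[] = $file;
        }
    }
    closedir($handle);
    return $unwritable;
}
```

An empty array means every catalog page found at that level is writable by the user PHP runs as.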

NuAlpha 01-02-2005 05:13 PM

Quote:

Originally Posted by markwhoo
With my inability to add the literal tags on the few scripts that didn't work until they were removed, will this Mod work for me without damage to the scripts?


The mod I posted is only supposed to be used on the HTML catalog that you generate through Xcart.

markwhoo 01-02-2005 06:05 PM

Quote:

Originally Posted by NuAlpha
Quote:

Originally Posted by markwhoo
With my inability to add the literal tags on the few scripts that didn't work until they were removed, will this Mod work for me without damage to the scripts?


The mod I posted is only supposed to be used on the HTML catalog that you generate through Xcart.


Yes, I understand that, but the JavaScript I am referring to lives in the customer home.tpl, which is used to generate those HTML catalog pages.

So, when the catalog generation has been completed, and this mod has run to remove white spaces, the code embedded in the html pages, will it be damaged?

Thanks for the reply

NuAlpha 01-02-2005 09:16 PM

Quote:

Originally Posted by markwhoo
So, when the catalog generation has been completed, and this mod has run to remove white spaces, the code embedded in the html pages, will it be damaged?


If you are referring to the JavaScript being damaged by stripping the white space, then no, that code won't be damaged. The JavaScript is accounted for and allowed to keep its extra white space.

As for any other non-JavaScript code encased in {literal} tags, there is no way to prevent spaces from being stripped, because once the catalog has been generated it is written purely as HTML, not in Smarty template form.

Is this what you were referring to? Sorry if I over/under explained myself.
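
The idea of cleaning the HTML while leaving JavaScript alone can be sketched as follows. This is not the regex from the posted script; `collapse_whitespace_outside_scripts()` is a hypothetical function that shows one way to do it, by splitting the page at `<script>` blocks and collapsing white space only in the pieces between them.

```php
<?php
// Illustrative only; this is not the regex used by the posted mod.
// Collapses runs of white space to a single space everywhere except
// inside <script>...</script> blocks, which are passed through untouched.
function collapse_whitespace_outside_scripts($html) {
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes = text outside scripts, odd indexes = whole script blocks.
    $parts = preg_split('#(<script\b.*?</script>)#is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = preg_replace('/\s+/', ' ', $part);
        }
    }
    return implode('', $parts);
}

$page = "<p>a   b</p>\n\n<script>\nvar x = 1;\n  var y = 2;\n</script>\n<p>c</p>";
echo collapse_whitespace_outside_scripts($page), "\n";
```

Other white-space-sensitive elements such as `<pre>` or `<textarea>` are not protected by this sketch and would need the same treatment.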


All times are GMT -8.