X-Cart: shopping cart software


NuAlpha 06-26-2004 04:49 PM

HTML Catalog Cleaner - Removes excess white space
 
This script acts like a one-time {strip} tag: it removes all unnecessary whitespace from the .html files generated by the HTML Catalog, reducing the bandwidth needed to serve them.

At the moment I don't have a .tpl interface written for it, so you have to call it from your browser. It works on your existing catalog, though I am sure you could pull out the regex code and stick it into admin/html_catalog.php if you're so inclined.

Here is the code. Put it in whatever directory you wish under whatever name you wish (.php extension of course)...
Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##            ## HTML Catalog Cleaner ##            ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.0.0 (6/26/2004)
## Last updated: 1.0.4 6/26/2004

# Function to Clean-up #
########################
function script_shutdown() {
        ini_set('max_execution_time', '30'); // Reset the maximum execution time.
        ini_set('max_input_time', '60'); // Reset the maximum input time.
        ob_implicit_flush(0); // Data should be kept in the buffer until ready.
}

register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution/input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
ob_implicit_flush(1); // Show the progress in the browser.

echo "Stripping the HTML files of excess spaces...<br />\n";

# Initialize variables.
$dir = '/home/your-site-dir/path-to-xcart/catalog'; // Set the absolute directory path.
$successes = 0;
$failures = 0;
$filesize['init'] = 0;
$filesize['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0; // Progress bar length reducer

# Initialize settings.
$regex = array(''=>'/[\t\n\r\f]+/', // Newlines and tabs
                          ' '=>'/ +/', // Excess spaces
                          '&nbsp;'=>'/&nbsp; /i', // Additional space after non-breaking space
                          '><'=>'/> </' // Space between HTML tags
                          );

# Open the directory and store the file list.
if (is_dir($dir)) {
        if ($dh = opendir($dir)) {
                # Iterate over file list.
                while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
                        if (strpos($filename,'.htm') !== false)
                                $file_list[] = $filename;
                }
                closedir($dh); // Close the directory.
        }
       
        # Perform specific operations on the files.
        foreach ($file_list as $file) {
                if (($fs_tmp = filesize($dir.DIRECTORY_SEPARATOR.$file)) !== false)
                        $filesize['init'] += $fs_tmp;
                $file_contents = file_get_contents($dir.DIRECTORY_SEPARATOR.$file);
                foreach ($regex as $replace=>$finds) // Do each replacement.
                        $file_contents = preg_replace($finds,$replace,$file_contents);
                $fp = fopen($dir.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
                if (!fwrite($fp,$file_contents)) {
                        $failure_list[] = $file; // Log failures.
                        $failures++;
                } else {
                        $successes++;
                }
                fclose($fp);
                if (($fs_tmp = filesize($dir.DIRECTORY_SEPARATOR.$file)) !== false)
                        $filesize['final'] += $fs_tmp;
                if ($pblr == 5) { // Modify this number to reduce the length of progress bar in case of a lot of files.
                        echo '|'; // Show progess line.
                        $cnt['tmp']++; // Increment the newline counter.
                        $pblr = 0; // Reset pblr counter.
                } else {
                        $pblr++;
                }
                $cnt['tot']++; // Increment totals counter.
                if ($cnt['tmp']==300) { echo '<br />'; $cnt['tmp']=0; } // Start a new progress bar row and reset the counter.
        }
} else
        die(''.$dir.' is not a directory!');

echo '<p>There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '<p>Your HTML Catalog files initially occupied '.number_format($filesize['init']).' bytes of disk space.';
echo '<br />They now occupy '.number_format($filesize['final']).' bytes of disk space.</p>';

if (isset($failure_list)) {
        echo '<p>The following files could not be written:<br />';
        foreach ($failure_list as $fail)
                echo $fail.'<br />';
        echo '</p>';
}
?>
</body>
</html>


Let me know if you find any problems! :)

adpboss 06-26-2004 10:28 PM

Code:

# Initialize variables.
$dir = '/home/your-site-dir/path-to-xcart/catalog'; // Set the absolute

Don't forget to set the path to your html catalog!

I'm testing now! :-)

adpboss 06-26-2004 10:34 PM

The code is cleaned up although the reported initial and finished file size for the work is the same.

Pages are lightning quick.

This is a SICK SICK mod. I love it.

I just added it as a link under my HTML Catalog link in the menu admin.

skin1/admin/menu_admin.tpl

NuAlpha 06-26-2004 10:44 PM

Quote:

Originally Posted by adpboss
The code is cleaned up although the reported initial and finished file size for the work is the same.


I had already done the once over with this mod on our catalog, when at the last second I decided to add the 'filesize difference' code before posting so I never even saw the output. :lol:

I am surprised that there was no difference in filesize. :? I wonder why not....hmmm. :?:
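
A possible explanation, offered as a guess rather than a confirmed diagnosis: PHP caches the results of `filesize()` and related stat calls, so a second `filesize()` on the same path after rewriting the file can return the size measured before the write. Calling `clearstatcache()` between the write and the second measurement forces a fresh lookup. The sketch below shows the pattern on a throwaway temp file (an assumption for the demo, not one of the catalog files):

```php
<?php
// Hypothetical demo file; the real script would use its catalog files instead.
$path = tempnam(sys_get_temp_dir(), 'cat');
file_put_contents($path, "<p>a   lot   of   padding</p>\n\n\n");

$before = filesize($path);            // The result is now held in PHP's stat cache.

// Rewrite the file with less whitespace, roughly as the cleaner does.
$cleaned = preg_replace('/\s+/', ' ', file_get_contents($path));
file_put_contents($path, $cleaned);

clearstatcache();                     // Drop cached stat data so the next call re-reads it.
$after = filesize($path);

echo "before: $before bytes, after: $after bytes\n";
unlink($path);                        // Remove the demo file.
```

In the posted script the equivalent change would be a `clearstatcache()` call just before the second `filesize()` in the loop.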

NuAlpha 06-26-2004 10:49 PM

By the way adpboss, did you make use of those regular expressions I posted for adding {literal} tags to every space and newline in your plain text emails? I have not gotten a chance to actually test it, but I will be doing so soon as I need plain text labels to print.

Am thinking about writing a customizable mod to allow printing on sheets of Avery style labels.

adpboss 06-26-2004 11:19 PM

Found a bug: it kills the JavaScript in my product pages that pops up a new window for detailed images.

So I can't use it as-is, but hopefully I'll find time to play with this and get it to work. :(

Is there a way to PREVENT areas of code from being touched?

Code:

{literal}
<SCRIPT language=JavaScript1.2>
<!--
var store_language='{/literal}{$store_language}{literal}';

function product_option(name_of_option)
{
{/literal}
for(i=0; i<{$product_options_count|default:"0"}; i++)
  if (document.orderform[i].name.search(name_of_option) != -1)
        return document.orderform[i];
return -1;
{literal}
}

function FormValidation()
{
{/literal}
{if $javascript_code}
{$javascript_code}
{else}
return true;
{/if}
{literal}
}
-->
</SCRIPT>
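
One way to keep regions such as script blocks untouched, sketched here under assumptions rather than taken from the posted script: split the page on `<script>` elements with `preg_split()` and `PREG_SPLIT_DELIM_CAPTURE`, clean only the pieces between them, and join everything back together. The function names `clean_markup` and `clean_preserving_scripts` are made up for this example, and the patterns are a simplified version of the ones in the first post (whitespace runs collapse to one space instead of being deleted).

```php
<?php
// Collapse whitespace in plain markup (hypothetical helper, simplified patterns).
function clean_markup($html) {
    $html = preg_replace('/\s+/', ' ', $html);   // Runs of whitespace become one space.
    return preg_replace('/> </', '><', $html);   // Drop the space between adjacent tags.
}

// Clean everything except <script>...</script> blocks, which pass through verbatim.
function clean_preserving_scripts($html) {
    // With PREG_SPLIT_DELIM_CAPTURE and a single capture group, odd-indexed parts are
    // the captured script blocks and even-indexed parts are the markup between them.
    $parts = preg_split('/(<script\b[^>]*>.*?<\/script>)/is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = clean_markup($part);
        }
    }
    return implode('', $parts);
}

// Invented sample page with one script block containing a // comment.
$page = "<td>\n  <b>Item</b>\n</td>\n<SCRIPT language=JavaScript1.2>\n<!--\nvar a = 1; // note\nvar b = 2;\n-->\n</SCRIPT>\n<p>Done</p>\n";
$result = clean_preserving_scripts($page);
echo $result;
```

Because the script block is never passed through the replacements, its line breaks survive, which matters for code that relies on `//` comments or omits semicolons.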


NuAlpha 06-27-2004 07:16 AM

Hmmm...I will have to think about this one. Forgot about the javascript. :(

NuAlpha 06-27-2004 09:58 AM

Here is the revised version. This should take care of the javascript problem. :)
Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##            ## HTML Catalog Cleaner ##            ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.1.0 (6/26/2004)
## Last updated: 6/27/2004

# Define the Constants #
########################
define('CATALOG_DIR', '/home/nulimec/public_html/catalog'); // Set the absolute directory path to your catalog.
define('BAR_LENGTH_REDUCER', 3); // If you have over 1000 HTML files in your catalog, you may wish to set this number higher.
# System constants.
define('MAX_ET', ini_get('max_execution_time'));
define('MAX_IT', ini_get('max_input_time'));

# Initialize variables.
$successes = 0;
$failures = 0;
$filelength['init'] = 0;
$filelength['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0; // Progress bar length reducer variable

# Initialize regular expressions.
$regex = array(''=>'/[\t\n\r\f]+/', // Newlines and tabs
                          ' '=>'/ +/', // Excess spaces
                          '&nbsp;'=>'/&nbsp; /i', // Additional space after non-breaking space
                          '><'=>'/> </' // Space between HTML tags
                          );
$java_saver = '/(<SCRIPT[^>]*>[[:space:]]*|.+?<\/SCRIPT>)/i';

# Function to Clean-up #
########################
function script_shutdown() {
        ini_set('max_execution_time', MAX_ET); // Reset the maximum execution time.
        ini_set('max_input_time', MAX_IT); // Reset the maximum input time.
        ob_implicit_flush(0); // Data should be kept in the buffer until ready.
}

register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution/input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
ob_implicit_flush(1); // Show the progress in the browser.

# Pad with roughly 2 KB of tabs (well past the 256 bytes Internet Explorer waits for) to show output immediately.
for ($pad=0; $pad <= 8*256; $pad++) echo "\t";

echo "Stripping the HTML files of excess spaces...<br />\n";

# Open the directory and store the file list.
if (is_dir(CATALOG_DIR)) {
        if ($dh = opendir(CATALOG_DIR)) {
                # Iterate over file list.
                while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
                        if (strpos($filename,'.htm') !== false)
                                $file_list[] = $filename;
                }
                closedir($dh); // Close the directory.
        }
       
        # Perform specific operations on the files.
        foreach ($file_list as $file) {
                $file_contents = file_get_contents(CATALOG_DIR.DIRECTORY_SEPARATOR.$file);
                $filelength['init'] += strlen($file_contents);
                # Examine document for javascript code blocks and preserve them for restoration.
                if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
                        foreach ($got_java as $java_chip) {
                                if (is_array($java_chip)) { // Favorite ice cream! :)
                                        $java_scripts[] = $java_chip[1];
                                }
                        }
                        foreach ($regex as $replace=>$finds) // Do each replacement.
                                $file_contents = preg_replace($finds,$replace,$file_contents);
                        # Reverse the damage to the javascripts.
                        foreach ($java_scripts as $jscript) {
                                foreach ($regex as $replace=>$finds) // Determine what the stripped javascript block looks like.
                                        $stripped_java = preg_replace($finds,$replace,$jscript);
                                # Find the stripped java and replace it with the original code.
                                $file_contents = str_replace($stripped_java,$jscript,$file_contents);
                        }
                } else {
                        foreach ($regex as $replace=>$finds) // Do each replacement.
                                $file_contents = preg_replace($finds,$replace,$file_contents);
                }
                $fp = fopen(CATALOG_DIR.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
                if (!fwrite($fp,$file_contents)) {
                        $failure_list[] = $file; // Log failures.
                        $failures++;
                } else {
                        $successes++;
                }
                fclose($fp);
                $filelength['final'] += strlen($file_contents);
                if ($pblr == BAR_LENGTH_REDUCER) { // Progress bar length reducer.
                        echo '|'; // Lengthen the progess bar.
                        $cnt['tmp']++; // Increment the newline counter.
                        $pblr = 0; // Reset pblr counter.
                } else {
                        $pblr++;
                }
                $cnt['tot']++; // Increment totals counter.
                if ($cnt['tmp']==300) { echo '<br />'; $cnt['tmp']=0; } // Start a new progress bar row and reset the counter.
        }
} else
        die(''.CATALOG_DIR.' is not a directory! Please check the path and try again.');

echo '<p>There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '<p>Your HTML Catalog files had a total combined length of '.number_format($filelength['init']).' characters.';
echo '<br />They now have a total length of '.number_format($filelength['final']).' characters.</p>';
echo '<p>That is a total of <u>'.number_format($filelength['init']-$filelength['final']).'</u> excess white spaces removed from your files.</p>';

if (isset($failure_list)) {
        echo '<p>The following files could not be written to:<br />';
        $c = 'Y'; // Init background color notifier.
        foreach ($failure_list as $fail) { // Braces keep the echo inside the loop.
                # Show background color every other line for readability.
                if($c=='N') {$bgb=''; $bge=''; $c='Y';} else {$bgb='<font style="background-color:#E0E0E0">'; $bge='</font>'; $c='N';}
                echo $bgb.$fail.$bge.'<br />';
        }
        echo '</p>';
}
?>
</body>
</html>


adpboss 06-28-2004 12:29 AM

Script still kills Javascript and hangs.

I ran it four times tonight, and every time it ran for more than 15 minutes before I finally had to stop it.

:(

NuAlpha 06-28-2004 09:39 AM

I am going to be redoing our HTML catalog soon. I will have to work on it then. :?

Not sure what is wrong though.
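
Judging only from the code posted above (a reading of the listing, not a confirmed diagnosis), two things may contribute. First, `$java_scripts` is never emptied between files, so every file is checked against the script blocks of all earlier files. Second, the inner loop feeds the original `$jscript` to each pattern, so `$stripped_java` holds the result of the last pattern only, and the later `str_replace()` may not find it in the fully stripped page. The sketch below isolates the second point using two of the posted patterns; `$block` is an invented sample.

```php
<?php
$regex = array('' => '/[\t\n\r\f]+/',    // Newlines and tabs
               '><' => '/> </');         // Space between HTML tags
$block = "<b>x</b> <i>y</i>\n<u>z</u>";  // Invented sample input.

// As posted: each pass restarts from the original, so only the last pattern applies.
foreach ($regex as $replace => $finds)
    $last_only = preg_replace($finds, $replace, $block);

// Chained: each pass works on the output of the previous one.
$chained = $block;
foreach ($regex as $replace => $finds)
    $chained = preg_replace($finds, $replace, $chained);

var_dump($last_only === $chained);  // The two results differ for this sample.
```

If this reading is right, starting each file with an empty `$java_scripts` array and chaining the replacements when computing `$stripped_java` would be the first things to try.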


All times are GMT -8.
