HTML Catalog Cleaner - Removes excess white space

 
#1
06-26-2004, 04:49 PM
NuAlpha (X-Adept)
Join Date: Aug 2003
Location: US
Posts: 598

This script acts like a one-time {strip} tag: it removes all unnecessary white space from the .html files generated by the HTML Catalog, reducing the bandwidth needed to serve them.

At the moment I don't have a .tpl interface written for it, so you have to call it from your browser. It works on your existing catalog, though I am sure you could pull out the regex code and stick it into admin/html_catalog.php if you're so inclined.

Here is the code. Put it in whatever directory you wish under whatever name you wish (.php extension of course)...
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##                                                  ##
##              HTML Catalog Cleaner                ##
##                                                  ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.0.0 (6/26/2004)
## Last updated: 1.0.4 6/26/2004

# Function to Clean-up #
########################
function script_shutdown() {
    ini_set('max_execution_time', '30'); // Reset the maximum execution time.
    ini_set('max_input_time', '60');     // Reset the maximum input time.
    ob_implicit_flush(0);                // Data should be kept in the buffer until ready.
}
register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution/input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
ob_implicit_flush(1); // Show the progress in the browser.

echo "Stripping the HTML files of excess spaces... ";

# Initialize variables.
$dir = '/home/your-site-dir/path-to-xcart/catalog'; // Set the absolute directory path.
$successes = 0;
$failures = 0;
$filesize['init'] = 0;
$filesize['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0;       // Progress bar length reducer
$file_list = array(); // Stays empty if the directory holds no .htm files.

# Initialize settings.
$regex = array(''       => '/[\t\n\r\f]+/', // Newlines and tabs
               ' '      => '/ +/',          // Excess spaces
               '&nbsp;' => '/&nbsp; /i',    // Additional space after non-breaking space
               '><'     => '/> </'          // Space between HTML tags
              );

# Open the directory and store the file list.
if (is_dir($dir)) {
    if ($dh = opendir($dir)) {
        # Iterate over file list.
        while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
            if (strpos($filename,'.htm') !== false)
                $file_list[] = $filename;
        }
        closedir($dh); // Close the directory.
    }

    # Perform specific operations on the files.
    foreach ($file_list as $file) {
        if (($fs_tmp = filesize($dir.DIRECTORY_SEPARATOR.$file)) !== false)
            $filesize['init'] += $fs_tmp;
        $file_contents = file_get_contents($dir.DIRECTORY_SEPARATOR.$file);
        foreach ($regex as $replace=>$finds) // Do each replacement.
            $file_contents = preg_replace($finds,$replace,$file_contents);
        $fp = fopen($dir.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
        if (!$fp || !fwrite($fp,$file_contents)) {
            $failure_list[] = $file; // Log failures (file could not be opened or written).
            $failures++;
        } else {
            $successes++;
        }
        if ($fp)
            fclose($fp);
        if (($fs_tmp = filesize($dir.DIRECTORY_SEPARATOR.$file)) !== false)
            $filesize['final'] += $fs_tmp;
        if ($pblr == 5) { // Modify this number to reduce the length of progress bar in case of a lot of files.
            echo '|';      // Show progress line.
            $cnt['tmp']++; // Increment the newline counter.
            $pblr = 0;     // Reset pblr counter.
        } else {
            $pblr++;
        }
        $cnt['tot']++; // Increment totals counter.
        if ($cnt['tmp']==300) { echo '<br />'; $cnt['tmp']=0; } // Start a new line and reset the counter.
    }
} else
    die($dir.' is not a directory!');

echo '<p>There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '<p>Your HTML Catalog files initially occupied '.number_format($filesize['init']).' bytes of disk space.';
echo ' They now occupy '.number_format($filesize['final']).' bytes of disk space.</p>';
if (isset($failure_list)) {
    echo '<p>The following files could not be written: ';
    foreach ($failure_list as $fail)
        echo $fail.' ';
    echo '</p>';
}
?>
</body>
</html>

Let me know if you find any problems!
__________________
X-Cart Pro 4.5.5 Platinum
X-Payments 1.0.6
PHP 5.3.14
MySQL 5.1.68
Apache 2.2.23

#2
06-26-2004, 10:28 PM
adpboss (X-Man)
Join Date: Feb 2003
Location: Ontario, Canada
Posts: 2,389

Code:
# Initialize variables.
$dir = '/home/your-site-dir/path-to-xcart/catalog'; // Set the absolute
Don't forget to set the path to your html catalog!

I'm testing now!

#3
06-26-2004, 10:34 PM
adpboss

The code is cleaned up although the reported initial and finished file size for the work is the same.

Pages are lightning quick.

This is a SICK SICK mod. I love it.

I just added it as a link under my HTML Catalog link in the menu admin.

skin1/admin/menu_admin.tpl

#4
06-26-2004, 10:44 PM
NuAlpha

Quote:
Originally Posted by adpboss
The code is cleaned up although the reported initial and finished file size for the work is the same.

I had already run this mod once over our catalog when, at the last second, I decided to add the 'filesize difference' code before posting, so I never even saw the output.

I am surprised that there was no difference in filesize. I wonder why not....hmmm.
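
A likely explanation, offered as a guess rather than something tested on the catalog: PHP caches the result of filesize() (the manual points to clearstatcache() for this), so asking for the size of the same file again after rewriting it can return the old number. The sketch below shows the idea on a scratch file; fresh_filesize() is a made-up helper name and the temp file stands in for a catalog page.

```php
<?php
// Hypothetical helper (not in the original script): read a file's size
// after dropping PHP's cached stat() results.
function fresh_filesize($path)
{
    clearstatcache();          // forget cached file status information
    return filesize($path);    // now reflects what is on disk
}

// Scratch file standing in for one catalog page (assumption for the demo).
$path = tempnam(sys_get_temp_dir(), 'cat');

file_put_contents($path, "<p>  lots   of   space  </p>\n\n\n");
$before = fresh_filesize($path);

// Rewrite the file with the excess white space removed.
file_put_contents($path, "<p> lots of space </p>");
$after = fresh_filesize($path);

unlink($path);                 // clean up the scratch file

echo "before: $before bytes, after: $after bytes\n";
```

In the cleaner script the equivalent change would be a clearstatcache() call just before the second filesize() on each file.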

#5
06-26-2004, 10:49 PM
NuAlpha

By the way adpboss, did you make use of those regular expressions I posted for adding {literal} tags to every space and newline in your plain text emails? I have not gotten a chance to actually test it, but I will be doing so soon as I need plain text labels to print.

Am thinking about writing a customizable mod to allow printing on sheets of Avery style labels.

#6
06-26-2004, 11:19 PM
adpboss

Found a bug: it kills the JavaScript in my product pages that pops up a new window for detailed images.

So I can't use it as-is, but hopefully I'll find time to play with this and get it to work.

Is there a way to PREVENT areas of code from being touched?

Code:
{literal}
<SCRIPT language=JavaScript1.2>
<!--
var store_language='{/literal}{$store_language}{literal}';

function product_option(name_of_option) {
{/literal}
    for(i=0; i<{$product_options_count|default:"0"}; i++)
        if (document.orderform[i].name.search(name_of_option) != -1)
            return document.orderform[i];
    return -1;
{literal}
}

function FormValidation() {
{/literal}
{if $javascript_code}
{$javascript_code}
{else}
    return true;
{/if}
{literal}
}
-->
</SCRIPT>
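
One possible way to keep areas from being touched, as a rough sketch rather than a tested patch: split each page on its <script>...</script> blocks and run the replacements only on the pieces in between. The helper name strip_outside_scripts() and the sample $html are made up for illustration, and the '&nbsp;' entry is my reading of the "space after non-breaking space" rule in the script.

```php
<?php
// Hypothetical helper (not part of X-Cart or the posted script):
// apply the white space replacements everywhere except inside <script> blocks.
function strip_outside_scripts($html)
{
    // Replacement => pattern, in the same order the cleaner applies them.
    $regex = array(
        ''       => '/[\t\n\r\f]+/', // Newlines and tabs
        ' '      => '/ +/',          // Excess spaces
        '&nbsp;' => '/&nbsp; /i',    // Space after a non-breaking space (assumed pattern)
        '><'     => '/> </',         // Space between HTML tags
    );

    // The capturing group makes preg_split() return the <script> blocks too.
    // They land on the odd indexes; the surrounding markup is on the even ones.
    // The "s" modifier lets "." cross newlines, "i" ignores tag case.
    $parts = preg_split(
        '/(<script\b[^>]*>.*?<\/script>)/is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // A <script> block: leave it exactly as written.
        }
        foreach ($regex as $replace => $find) {
            $part = preg_replace($find, $replace, $part);
        }
        $parts[$i] = $part;
    }

    return implode('', $parts);
}

// Made-up sample page fragment.
$html = "<p>Hello   world</p>\n<script>\n// popup\nvar a = 1;\n</script>\n<p> done </p>";
$out  = strip_outside_scripts($html);

echo $out, "\n";
```

The "// popup" comment survives here because its newline is kept; once a script is flattened to one line, a // comment swallows the code after it. The same split could probably be widened to <pre> and <textarea> by extending the pattern.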

#7
06-27-2004, 07:16 AM
NuAlpha

Hmmm...I will have to think about this one. Forgot about the javascript.

#8
06-27-2004, 09:58 AM
NuAlpha

Here is the revised version. This should take care of the javascript problem.
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>HTML Catalog Cleaner</title>
</head>
<body>
<?php
######################################################
##                                                  ##
##              HTML Catalog Cleaner                ##
##                                                  ##
######################################################
##                                                  ##
## Strips every file in the HTML catalog directory  ##
## of all excess white spaces.                      ##
######################################################
## Version: 1.1.0 (6/26/2004)
## Last updated: 6/27/2004

# Define the Constants #
########################
define('CATALOG_DIR', '/home/nulimec/public_html/catalog'); // Set the absolute directory path to your catalog.
define('BAR_LENGTH_REDUCER', 3); // If you have over 1000 HTML files in your catalog, you may wish to set this number higher.

# System constants.
define('MAX_ET', ini_get('max_execution_time'));
define('MAX_IT', ini_get('max_input_time'));

# Initialize variables.
$successes = 0;
$failures = 0;
$filelength['init'] = 0;
$filelength['final'] = 0;
$cnt['tmp'] = 0; // Newline counter
$cnt['tot'] = 0; // Totals counter
$pblr = 0;       // Progress bar length reducer variable
$file_list = array(); // Stays empty if the directory holds no .htm files.

# Initialize regular expressions.
$regex = array(''       => '/[\t\n\r\f]+/', // Newlines and tabs
               ' '      => '/ +/',          // Excess spaces
               '&nbsp;' => '/&nbsp; /i',    // Additional space after non-breaking space
               '><'     => '/> </'          // Space between HTML tags
              );
$java_saver = '/(<SCRIPT[^>]*>[[:space:]]*|.+?<\/SCRIPT>)/i';

# Function to Clean-up #
########################
function script_shutdown() {
    ini_set('max_execution_time', MAX_ET); // Reset the maximum execution time.
    ini_set('max_input_time', MAX_IT);     // Reset the maximum input time.
    ob_implicit_flush(0);                  // Data should be kept in the buffer until ready.
}
register_shutdown_function('script_shutdown'); // Register the shutdown function.

# Modify PHP Settings #
#######################
ini_set('max_execution_time', '14400'); // Make the maximum execution/input time 4 hours so that the script doesn't time-out.
ini_set('max_input_time', '14400');
ini_set('zlib.output_compression', 'Off'); // Turn off zlib compression, if On, to prevent Mozilla output problems.
ob_implicit_flush(1); // Show the progress in the browser.

# Pad the output for Internet Explorer to show output immediately.
for ($pad=0; $pad <= 8*256; $pad++)
    echo "\t";

echo "Stripping the HTML files of excess spaces... ";

# Open the directory and store the file list.
if (is_dir(CATALOG_DIR)) {
    if ($dh = opendir(CATALOG_DIR)) {
        # Iterate over file list.
        while (($filename = readdir($dh)) !== false) { // Use instead of scandir to skip some files.
            if (strpos($filename,'.htm') !== false)
                $file_list[] = $filename;
        }
        closedir($dh); // Close the directory.
    }

    # Perform specific operations on the files.
    foreach ($file_list as $file) {
        $file_contents = file_get_contents(CATALOG_DIR.DIRECTORY_SEPARATOR.$file);
        $filelength['init'] += strlen($file_contents);

        # Examine document for javascript code blocks and preserve them for restoration.
        if (preg_match_all($java_saver,$file_contents,$got_java,PREG_SET_ORDER)) {
            foreach ($got_java as $java_chip) {
                if (is_array($java_chip)) { // Favorite ice cream! :)
                    $java_scripts[] = $java_chip[1];
                }
            }
            foreach ($regex as $replace=>$finds) // Do each replacement.
                $file_contents = preg_replace($finds,$replace,$file_contents);

            # Reverse the damage to the javascripts.
            foreach ($java_scripts as $jscript) {
                foreach ($regex as $replace=>$finds) // Determine what the stripped javascript block looks like.
                    $stripped_java = preg_replace($finds,$replace,$jscript);
                # Find the stripped java and replace it with the original code.
                $file_contents = str_replace($stripped_java,$jscript,$file_contents);
            }
        } else {
            foreach ($regex as $replace=>$finds) // Do each replacement.
                $file_contents = preg_replace($finds,$replace,$file_contents);
        }

        $fp = fopen(CATALOG_DIR.DIRECTORY_SEPARATOR.$file, 'w'); // Truncate file, then apply the modifications.
        if (!$fp || !fwrite($fp,$file_contents)) {
            $failure_list[] = $file; // Log failures (file could not be opened or written).
            $failures++;
        } else {
            $successes++;
        }
        if ($fp)
            fclose($fp);
        $filelength['final'] += strlen($file_contents);

        if ($pblr == BAR_LENGTH_REDUCER) { // Progress bar length reducer.
            echo '|';      // Lengthen the progress bar.
            $cnt['tmp']++; // Increment the newline counter.
            $pblr = 0;     // Reset pblr counter.
        } else {
            $pblr++;
        }
        $cnt['tot']++; // Increment totals counter.
        if ($cnt['tmp']==300) { echo '<br />'; $cnt['tmp']=0; } // Start a new line and reset the counter.
    }
} else
    die(CATALOG_DIR.' is not a directory! Please check the path and try again.');

echo '<p>There were '.number_format($successes).' successful cleanings and '.number_format($failures).' failures out of a total of '.number_format($cnt['tot']).' files.</p>';
echo '<p>Your HTML Catalog files had a total combined length of '.number_format($filelength['init']).' characters.';
echo ' They now have a total length of '.number_format($filelength['final']).' characters.</p>';
echo '<p>That is a total of <u>'.number_format($filelength['init']-$filelength['final']).'</u> excess white spaces removed from your files.</p>';
if (isset($failure_list)) {
    echo '<p>The following files could not be written to:</p>';
    $c = 'Y'; // Init background color notifier.
    foreach ($failure_list as $fail) {
        # Show background color every other line for readability.
        if ($c=='N') {$bgb=''; $bge=''; $c='Y';}
        else {$bgb='<font style="background-color:#E0E0E0">'; $bge='</font>'; $c='N';}
        echo $bgb.$fail.$bge.'<br />';
    }
}
?>
</body>
</html>

#9
06-28-2004, 12:29 AM
adpboss

Script still kills Javascript and hangs.

I ran it four times tonight, and every time it ran for more than 15 minutes before I eventually had to stop it.


#10
06-28-2004, 09:39 AM
NuAlpha

I am going to be redoing our HTML catalog soon. I will have to work on it then.

Not sure what is wrong though.
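
A few guesses from reading the 1.1.0 listing, none of them tested against a real catalog. The loop that works out what the stripped JavaScript looks like starts from $jscript on every pass, so only the last pattern's result is kept and str_replace() has nothing to match. The $java_saver pattern has no "s" modifier, so its "." cannot cross line breaks. And $java_scripts is never emptied between files, so every file repeats the work for all the scripts seen before it, which might be where the time goes. The sketch below shows only the first point, using a made-up $jscript sample and the '&nbsp;' pattern as I read it.

```php
<?php
// Replacement => pattern, in the order the cleaner applies them.
$regex = array(
    ''       => '/[\t\n\r\f]+/', // Newlines and tabs
    ' '      => '/ +/',          // Excess spaces
    '&nbsp;' => '/&nbsp; /i',    // Space after a non-breaking space (assumed pattern)
    '><'     => '/> </',         // Space between HTML tags
);

// Made-up script block standing in for one captured from a catalog page.
$jscript = "<script>\n  var a = 1;\n  var  b = 2;\n</script>";

// As in the 1.1.0 listing: each pass reads $jscript again, so the result
// of the earlier patterns is thrown away and only the last one counts.
foreach ($regex as $replace => $find) {
    $unchained = preg_replace($find, $replace, $jscript);
}

// Chained: each pass works on the output of the previous one, which is
// what actually happened to the script inside the page.
$chained = $jscript;
foreach ($regex as $replace => $find) {
    $chained = preg_replace($find, $replace, $chained);
}

var_dump($unchained === $jscript); // bool(true): nothing to search for
var_dump($chained);                // the one-line form found in the cleaned page
```

If that reading is right, searching for $chained instead would find the flattened block, though skipping the script blocks in the first place is probably simpler than repairing them afterwards.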
All times are GMT -8.

   

 
X-Cart forums © 2001-2020