News items on the OpenSourceToday site were mostly rewrites of official company [1.] press releases. By their nature these items were highly volatile (read: rapidly prone to rot) and hence inappropriate for longer term retention. Moreover, whatever marginal gain they offered depended on prompt display. Therefore, rapid, dynamic updating of the News page was inherently sensible. I describe here one way to meet those needs.
I am limiting both my effort and the space consumed by citing the previous article, on summary article listings, which had similar needs. Much of the process is nearly identical, so I really begin the new discussion where the requirements diverge. My goal is to limit repetition and the size of this piece while supplying sufficient material that the process can serve as a guide that could run on another production system.
I think the best way to start is to compare the pages, noting where they are similar and where they differ:
Figure 1. Open Source Today's Home Page Left and Central Columns
Concentrating on the central column, remember the reversed date order. It is the same on the News page; however, date-time ordering would be the better description:
Figure 2. Full News Page
Yes, the most recent news item is at the top; however, no date-time stamp is posted. What I call to your attention is the large, bold titles that lead into descriptive text of varying length [2.]. Thus a technique similar to the one used for the article listing could be employed, but some alterations have to be made in the scripting.
The easiest way to obtain the necessary background is to read the previous article beginning at "Starting Assumptions" and continue (scanning the code) into the section on "Build Article Data String". Yes, there are real differences; however, so much is similar that it would be a waste of time to duplicate what is easily accessible.
[For those dubious of my suggestion: I still have the same requirements, assumptions, processes and techniques, e.g. the storage questions are not resolved for either the short term or permanent means. Nonetheless, I assume the data I need is readily available and identifiable once verified to be good [3.]. In addition, I continue to push the read file function as my preferred means to render web content. There is more; however, I think it is time to explore where News items deviate from the article listing summaries.]
I mentioned in one of the pieces linked under footnote 3 that the News item input should not include all the input fields. Indeed, the author's name is there just for tracking and confirmation purposes. The summary and the keywords should be empty. Moreover, I think the file name is also superfluous, since I doubt I would need it for this application. Remember this content is ephemeral; long term storage is not desirable.
The content on the News page also differs from the article listings. An abbreviated form of the News page source is shown below:
<div id="central-col-prodnews">
<h2 id="central-text">Open Source Security \
and Data Center Consolidation become \
Key Opportunities for Resellers</h2>
<p>A study by the Federal Open Source Alliance \
has revealed that the Federal Government has \
an increasing appetite for open source. ...</p>
<p>Similar to the state government market, the \
Federal agencies data center consolidation is \
... download at \
<a href="www.federalopensourcealliance.com">\
this web site</a>.</p>
<p>Surveys of the Department of Defense (DoD), \
... </p>
<h2 id="central-text">Analysts Report says Linux \
Losing Market Share - What's New?</h2>
<p>IDC's latest report sees Linux ...</p>
...
<h2 id="central-text">SYSGO - An Embedded Linux \
with real-time capabilities</h2>
<p>If you're looking for an embedded Linux partner\
... </p>
...
</div> <!-- End of central Product News column -->
Listing 1. News Page Content
Among the differences, there are no floating div tags holding the content. In addition, the News items are more akin to an article, albeit with less structure. Each News item has a sizable heading for the title, followed by an arbitrary number of paragraphs that could hold nearly any type of content. The article summaries, by contrast, used a set string with mostly fixed characteristics that was wedged between the custom div tags that filled the central column of the OpenSourceToday home page. Here the items are separated by a few empty lines at most and fill the page, one following the other.
Due to the differences in content, the News item requires revised code from that shown in the previous article. The revision adds a loop to load its content into the stand aside temporary file. For convenience, here is how it was done for the article summaries, which required only three lines of text:
<?php
$temp_store = $text . "/article-list.txt.tmp-hold";
$filehandle = fopen($temp_store, "w");
// write in the three lines of content
$line_1 = $div_line . "\n";
fwrite($filehandle, $line_1);
// the summary paragraph, closed with </p>
$line_2 = "<p>" . $article_det_str . "</p>\n";
fwrite($filehandle, $line_2);
$line_3 = $div_end . "\n";
fwrite($filehandle, $line_3);
fclose($filehandle);
?>
Listing 2. Writing in New Article Summary
The code above will not suffice; we need to write in a more complex set of text. First, however, there has to be an agreement with the poster(s) on whether they will consistently include the html tags or consistently omit them. Obviously the code would differ, hence whatever the choice, it applies to all. Nonetheless, whichever is chosen, I intend to ignore the possibility that html code might have to be added. The title should be straightforward: it is in separate storage and identifiable, hence prefacing the title with <h2> and ending it with </h2> is easily accomplished. In the body, each new paragraph would be prefaced with a <p> tag; that part is easy. Placing the paragraph ending tag would require looking at the next line to see whether it was empty or the end of file before slapping on the end paragraph tag and a new line. These are somewhat messy details that would tend to detract attention from the major elements in the code. That is my reasoning for not considering it here.
I noted above that we need to add the title followed by an arbitrary number of paragraphs; upon reaching the end of the storage file we might add a few empty paragraph opening and closing tags to space the news items. We would obviously have to loop through the content line by line, writing each line of text into the temporary file. This process would look more like the append content step in the previously cited article summary listing than the one above. Here is an outline that covers the major requirements to fill the temporary file with the new News item:
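For readers who want to see the shape of those messy details anyway, here is a minimal sketch of the tag wrapping described above. It is not part of the process in this article. The function name `news_item_to_html` is hypothetical, and the sketch assumes the poster supplies plain text with paragraphs separated by one or more empty lines and that the text has already been vetted.

```php
<?php
// Sketch only: news_item_to_html() is a hypothetical helper, not from the article.
// Assumes $title and $body are plain text that has already been verified as good,
// and that paragraphs in $body are separated by one or more empty lines.
function news_item_to_html(string $title, string $body): string
{
    $html = "<h2>" . trim($title) . "</h2>\n";
    $paragraph = [];
    $lines = preg_split('/\R/', $body);
    // A trailing empty line guarantees the last paragraph is closed at end of input.
    $lines[] = '';
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line !== '') {
            // still inside a paragraph: collect the line
            $paragraph[] = $line;
        } elseif ($paragraph) {
            // an empty line (or end of input) closes the open paragraph
            $html .= "<p>" . implode(' ', $paragraph) . "</p>\n";
            $paragraph = [];
        }
    }
    return $html;
}
```

Appending one sentinel empty line means the "next line is empty" test and the "end of file" test collapse into a single branch, which is one way to tame the look-ahead mentioned above.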
<?php
// open temp file as write
$temp_store = $text . "/news-list.txt.tmp-store";
$filehandle = fopen($temp_store, "w");
// open stored new News item in read mode
$new_news = $text . "/stored-new-news.txt";
$filehandle2 = fopen($new_news, "r");
// loop through content and write into tmp-store;
// fgets() returns false at end of file
while (($read_line = fgets($filehandle2)) !== false) {
    fwrite($filehandle, $read_line);
}
// close tmp-store; the balance is appended in the next step
fclose($filehandle);
// no longer needed
fclose($filehandle2);
...
Listing 3. Loading Newest News Item
in temporary file
In the next step the active file, which contains all the current News items, is appended to this new tmp-store file. The difference from the code shown above is slight: in this instance the tmp-store file is opened in append mode and the current News item file is opened for reading:
// open temp file as append
$temp_store = $text . "/news-list.txt.tmp-store";
$filehandle = fopen($temp_store, "a");
// open stored current News item list in read mode
$news_list = $text . "/current-news.txt";
$filehandle2 = fopen($news_list, "r");
// loop structure the same as above with
// differing files
...
Listing 4. Loading Remaining News List Items
into temporary file
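The two loading steps can also be sketched as a pair of small functions. This is a hedged illustration, not the code from the production site: the names `append_file_lines` and `build_news_temp` are hypothetical, and the sketch keeps a single write handle open for both steps rather than reopening the temporary file in append mode.

```php
<?php
// Sketch only: both function names are hypothetical, not from the article.
// Copies every line of $source_path onto the end of the already open $dest_handle.
function append_file_lines(string $source_path, $dest_handle): bool
{
    $source = fopen($source_path, "r");
    if ($source === false) {
        return false;
    }
    // fgets() returns false at end of file, which ends the loop
    while (($line = fgets($source)) !== false) {
        if (fwrite($dest_handle, $line) === false) {
            fclose($source);
            return false;
        }
    }
    fclose($source);
    return true;
}

// Writes the newest News item first, then the current list below it.
function build_news_temp(string $new_item, string $current_list, string $temp_store): bool
{
    $dest = fopen($temp_store, "w");   // "w" truncates any stale temporary file
    if ($dest === false) {
        return false;
    }
    $ok = append_file_lines($new_item, $dest)
        && append_file_lines($current_list, $dest);
    fclose($dest);
    return $ok;
}
```

A call such as `build_news_temp($new_news, $news_list, $temp_store)` would return false if either input file could not be read, leaving the live list untouched.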
There is really no difference here, other than the content and the file names, from what was shown in the article summary listing cited so many times previously. Again, the next steps after closing the files above are making a backup copy of the existing news item list and then copying the temporary version over the older listing. Once safely ensconced in its new identity, the file will be read in by the template when the page is next rendered, with the newest news item as the lead item at the top.
That is, these commands are run in the php script. First, make a backup copy of the older news item list. Second, copy the newly created complete temporary listing over the older version, using the same file name:
copy($news_list, $news_list_bkup);
// backup copy
copy($temp_store, $news_list);
// if there were no failures
// the process is complete
What has not been duplicated this time is the frequent reminder to add error testing code and the explicit need to inform the maintainers via email when an error has occurred. I also suggest attempting to recover by copying the backup into the working file. Read the previous article to see those suggestions.
Basically there is very little difference between the code that creates the updated article summaries on the home page and the code that updates the News item listing on its separate page. The latter is more complex and on average larger; however, other than the initial loading of the temporary file, the code has the same form with the same potential for failures. News items might, a priori, have a marginally higher failure probability, but not significantly so. Both should work with a high level of confidence; nonetheless, error catching code should be included.
One very significant portion of the News item listing has not been automated with the displayed techniques. As mentioned in the introduction, these news items, which in many cases were to be sprung off of company press releases, have dubious foundations. I have supplied no automated method to hide or remove content. Therefore, if this were to be a routine part of the OpenSourceToday site, items that were too old would have had to be removed by hand.
I will show a better, simpler method to accomplish the same task, plus the ability to automatically discard the older material. We shall revisit this automation task to see how taking a different view simplifies the effort while making it less prone to error. Moreover, it is less process intensive, with most of the vulnerable file processing steps removed. I cannot promise the complete absence of exposure to potential error; however, the likelihood and the consequences are lessened.
My ruminations on the partial automation of the OpenSourceToday site will end with a brief discussion on how some of the data inputs might be stored while awaiting vetting. These thought experiments have served a constructive purpose, but beyond this point there is too little return to warrant further effort. I will continue to think about data storage and ways to render larger pages that are now statically stored as separate html files. However, my focus will be upon my current site and its needs. This change of focus may provide a clearer picture of the true requirements. Nonetheless, even with both these advantages, it does not preclude the likelihood of a drawn out process. Therefore, I will make no prediction on when those discussions will appear, nor can I promise that the final result will have arrived with no missteps.
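As a reminder of what such error testing might look like, here is a minimal sketch of the backup, copy and recovery sequence. It is an illustration under stated assumptions, not the code from the previous article: the function name `publish_news_list` and the `$notify` callback are hypothetical, and the callback stands in for whatever email mechanism the site uses to reach the maintainers.

```php
<?php
// Sketch only: publish_news_list() and the $notify callback are hypothetical.
// $notify receives a plain text message; in production it might send email.
function publish_news_list(string $temp_store, string $news_list,
                           string $news_list_bkup, callable $notify): bool
{
    // Step 1: back up the live list. If this fails nothing has changed yet, so stop.
    if (!@copy($news_list, $news_list_bkup)) {
        $notify("News update aborted: could not back up $news_list");
        return false;
    }
    // Step 2: copy the temporary listing over the live list.
    if (@copy($temp_store, $news_list)) {
        return true;
    }
    // Step 3: the live list may be damaged, so try to restore it from the backup.
    $restored = @copy($news_list_bkup, $news_list);
    $notify($restored
        ? "News update failed: $news_list restored from backup"
        : "News update failed and restore failed: check $news_list by hand");
    return false;
}
```

Passing the notifier in as a callback keeps the file handling testable without sending mail; the `@` operator only silences PHP's warnings, since the return values are checked explicitly.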
Corrections, suggested extensions or comments, write: H. Cohen.
© Herschel Cohen, All Rights Reserved
____________________________________________________________________
1. Supposedly from companies interested or active in Open
Source.
2. To see full page, use this linked example.
3. Look at the various discussions on the security questions
linked to taking external data from html forms. Here are a few
that appeared on LXer.com.
Secure Web Input - Data Analysis
Web Input - Securing Data, First Level of Defense
Web Input - Securing Data, Second Level of Defense
Web Input - Securing Data, Hybrid Approach
____________________________________________________________________