Nutch in Eclipse – Community Service

26 Nov

Just follow the link @


The Robotic Ontology Topic Assig2

23 Oct

For some reason the Turtle file provided on Blackboard did not open in Protege, but I was able to view it as an .xml file and went on to recreate the ontology using Paint.

In the field of the Semantic Web, every ontology that is created, or has been created in the past, has room for improvement. Technology changes, and so do the ways in which we collect data. In this case (the robotics ontology), as Joseph and Brian mentioned in the document, the ontology can be improved by adding Jena so that future searches can be more accurate and useful. On the other hand, this can cost speed and make the application run slower, as in this particular case. It all depends on what we are willing to risk when creating the ontology.

In the robotics ontology, we can see that some of the classes could be eliminated or further expanded; examples are the micro-controllers and mechanics classes under components, and the sensor interfacing tutorial and communication hardware tutorial classes under the tutorial class. For the rest, I find it very interesting, and I like the way the ontology was created, taking into account how the classes and sub-classes were going to be arranged. Information is everything, so details are important when finding better ways of gathering it.

Ingredients of an Intelligent Web Topic Assig1

10 Oct

The ingredients of an intelligent web (defined in the book Algorithms of the Intelligent Web) as shown in figure 1.1 are:

  • Aggregated content
    The content is defined as a large amount of data pertinent to a specific application. It is dynamic, geographically dispersed, and associated with or linked to other pieces of information.
  • Reference structures
    The structures provide one or more structural and semantic interpretations of the content. One example is tags, which annotate content dynamically and are constantly updated. The three types of structure are dictionaries, knowledge bases, and ontologies.
  • Algorithms
    This refers to the layer of modules that allows applications to harness the information, which is hidden in the data, and use it for the purpose of abstraction, prediction, and improved interaction with its users.

Now that we have defined the 3 ingredients of an intelligent web, our task was to identify them in the following domains:

  • Social networking sites
    • Content: as the book notes, content is readily available in the form of video, audio, and postings.
    • Reference: as in the case of MySpace, the content is categorized by links or tags into different groups.
    • Algorithms: these are at work whenever the user is recommended a friend, video, image, etc. based on the user's preferences.
  • Mashups
    • Content: the content is obtained from external data sources (borrowed).
    • Reference: as on most pages on the web, content is classified into groups, tags, or links.
    • Algorithms: to display information for viewing, algorithms are needed to grab similar content.
  • Portals
    • Content: gathered from the internet or an intranet.
    • Reference: the information is displayed in categories under a common heading such as business, health, world, sci/tech, etc.
    • Algorithms: content is grouped automatically or semi-automatically based on a reference structure.
  • Wikis
    • Content: content has a predefined structure and is geographically dispersed.
    • Reference: the information displayed is categorized, and links refer to articles on the same subject.
    • Algorithms:
  • Media-sharing sites
    • Content: the files and information are gathered from external sources.
    • Reference: the content is categorized and sub-categorized in a hierarchy.
    • Algorithms:
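
To make the three ingredients a bit more concrete, here is a minimal Java sketch of the social networking case: items are the content, their tags are the reference structure, and a small ranking function is the algorithm that recommends items based on the user's preferred tags. The class name, the sample items, and the choice of Jaccard similarity are my own assumptions for illustration; the book does not prescribe this particular method.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy illustration: content (items) + reference structure (tags) + algorithm (ranking). */
final class TagRecommender {

    /** Jaccard similarity between two tag sets: |A intersect B| / |A union B|. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    /** Returns the items sharing at least one tag with the user, most similar first. */
    static List<String> recommend(Map<String, Set<String>> taggedItems, Set<String> preferredTags) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : taggedItems.entrySet()) {
            if (jaccard(entry.getValue(), preferredTags) > 0.0) {
                names.add(entry.getKey());
            }
        }
        // Sort by descending similarity; break ties alphabetically for a stable result.
        names.sort(Comparator
                .comparingDouble((String n) -> -jaccard(taggedItems.get(n), preferredTags))
                .thenComparing(Comparator.<String>naturalOrder()));
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical content with hypothetical tags.
        Map<String, Set<String>> items = Map.of(
                "robot-video", Set.of("robotics", "video"),
                "sensor-post", Set.of("robotics", "sensors", "tutorial"),
                "recipe-post", Set.of("cooking"));
        // Prints [sensor-post, robot-video]
        System.out.println(recommend(items, Set.of("robotics", "sensors")));
    }
}
```

A real site would of course use far richer signals than tag overlap, but the split between data, structure, and algorithm stays the same.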

Protege: can we create model(s), do queries, print results, etc? Assig4

8 Oct

As seen in previous lectures and in the tool assignments on how to use Protege, we can affirm that YES, we can! Protege is an open-source ontology editor that supports both OWL (Web Ontology Language) and frame-based modeling; furthermore, it allows us to create, query, reason over, and manipulate OWL models. As for file types, Protege can handle formats such as RDF, OWL, and XML Schema. In my opinion, how we create a model in Jena vs. Protege is probably the same, but the way we access it is not. In Jena, a model is accessed from code in a .java file along these lines (this example uses the Protege-OWL API to load a Jena-backed OWL model):

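import java.util.Collection;
import java.util.Iterator;

import edu.stanford.smi.protegex.owl.ProtegeOWL;
import edu.stanford.smi.protegex.owl.model.OWLIndividual;
import edu.stanford.smi.protegex.owl.model.OWLModel;
import edu.stanford.smi.protegex.owl.model.OWLNamedClass;

// URI of the ontology to load (left empty here)
String uri = "";

//alternatively, you can specify a local path on your computer
//for the travel.owl ontology. If you do this, it is more robust
//to go through the File object to get the URI instead of writing it by hand
// Example:
//String uri = new File("c:/Work/Projects/travel.owl").toURI().toString();

// Load the ontology into a Jena-backed OWL model
OWLModel owlModel = ProtegeOWL.createJenaOWLModelFromURI(uri);

// List every user-defined class together with its direct instances
Collection classes = owlModel.getUserDefinedOWLNamedClasses();
for (Iterator it = classes.iterator(); it.hasNext();) {
    OWLNamedClass cls = (OWLNamedClass) it.next();
    Collection instances = cls.getInstances(false);
    System.out.println("Class " + cls.getBrowserText() + " (" + instances.size() + ")");
    for (Iterator jt = instances.iterator(); jt.hasNext();) {
        OWLIndividual individual = (OWLIndividual) jt.next();
        System.out.println(" - " + individual.getBrowserText());
    }
}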

More information on the above code can be found here:

On the other hand, Protege is less about coding and more of a graphical user interface in which users can load and save OWL and RDF ontologies, edit and visualize classes and properties, define logical class characteristics as OWL expressions, execute reasoners such as description logic classifiers, and edit OWL individuals for Semantic Web markup.


FOAF Model & SPARQL Assig3

25 Sep

After researching and making several attempts to make this learning experience enjoyable, the results were not what we expected. Fortunately, thanks to one of our classmates, there is finally a way around the issue. The problem we were experiencing was caused by the .rdf file generated by the FOAF-a-Matic web page. For some reason, that file was missing the schema; as a result, opening it in Protege 4.0 produced errors and the information displayed was incomplete. This post lists the steps you can follow to make it work, so that we can eventually try some queries using SPARQL.

  • If desired, after generating the code you can paste it into Notepad and save it as an .rdf file. Another alternative is to download a free 30-day trial text editor that can be used to save the file and also to check whether the ontology is well-formed. This is how the ontology looks before making any changes to the data.
  • Because the file is missing some data, let's add the following: rdf:ID="xxxxx", where xxxxx represents any name or number you want to use. Here is a before and after sample.


Before:
<foaf:name>Diana Barrios</foaf:name>
<foaf:mbox rdf:resource=""/>

After:
<foaf:Person rdf:ID="05">
<foaf:name>Diana Barrios</foaf:name>
<foaf:mbox rdf:resource=""/>

  • Now let's combine these 2 files into 1.
    1. Open Protege_4.1.
    2. Choose Open OWL Ontology and select the file created with FOAF-a-Matic.
    3. Import the downloaded schema file: at the bottom of the Protege screen, under Imported ontologies, click the + sign next to Direct Imports and finish the wizard. You should now see the following on your screen: 0.1(
    4. Now in Protege, go to Refactor–>Merge Ontologies. Next, select the files you saved in the previous steps (FOAF-a-Matic and Schema) and click Continue. You will then go through several screens asking you to save the file (make sure the format is RDF/XML), rename the file or create a new one (optional), etc., until you are done with the wizard.
    5. As the last step in Protege_4.1, and for future use, save the file again by going to File–>Save as. Save the file in both formats (RDF/XML and Turtle), in any location and with any name you like.
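
The rdf:ID fix from the second bullet above can also be done in code instead of by hand. The following is only a sketch under my own assumptions: the class name FoafIdFixer, the "person" ID prefix, and the inline sample document are all hypothetical, and it uses nothing but the XML parser that ships with Java. Parsing doubles as the well-formedness check, and the prefix starts with a letter because RDF/XML expects rdf:ID values to be valid XML names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: give every unidentified foaf:Person element an rdf:ID. */
final class FoafIdFixer {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String FOAF_NS = "http://xmlns.com/foaf/0.1/";

    /** Parses RDF/XML text; the parser throws an exception if the text is not well-formed. */
    static Document parse(String rdfXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match elements by namespace URI
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(rdfXml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Adds rdf:ID to each foaf:Person lacking rdf:ID and rdf:about; returns how many were added. */
    static int addMissingIds(Document doc, String idPrefix) {
        NodeList people = doc.getElementsByTagNameNS(FOAF_NS, "Person");
        int added = 0;
        for (int i = 0; i < people.getLength(); i++) {
            Element person = (Element) people.item(i);
            boolean identified = person.hasAttributeNS(RDF_NS, "ID")
                    || person.hasAttributeNS(RDF_NS, "about");
            if (!identified) {
                added++;
                person.setAttributeNS(RDF_NS, "rdf:ID", idPrefix + added);
            }
        }
        return added;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a FOAF-a-Matic file with an unidentified person.
        String sample =
                "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:foaf=\"" + FOAF_NS + "\">"
              + "<foaf:Person><foaf:name>Diana Barrios</foaf:name></foaf:Person>"
              + "</rdf:RDF>";
        Document doc = parse(sample);
        System.out.println("IDs added: " + addMissingIds(doc, "person"));
    }
}
```

The sketch only changes the in-memory document; writing it back to an .rdf file would need an extra serialization step, which I left out to keep it short.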

Finally, open the file that was merged in Protege_4.1 using Protege_3.4.7 and start querying. Here are some examples:
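
As a stand-in for the kind of query meant here, the Java sketch below carries one hypothetical SPARQL query (list every person together with their foaf:name) as a text constant. It does not call a real SPARQL engine; it simply hand-matches the single triple pattern over a few hard-coded triples to show which rows such a query would return. The class name, the subject identifiers, and the triples themselves are my own assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: what a one-pattern SPARQL SELECT over FOAF data returns. */
final class FoafNameQuery {
    static final String FOAF_NAME = "http://xmlns.com/foaf/0.1/name";
    static final String FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows";

    /** Illustrative query text that could be pasted into a SPARQL query panel. */
    static final String QUERY =
            "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
          + "SELECT ?person ?name\n"
          + "WHERE { ?person foaf:name ?name }";

    /** A subject-predicate-object statement. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Hand-rolled match of { ?person foaf:name ?name }: one row per foaf:name triple. */
    static List<String[]> select(List<Triple> triples) {
        List<String[]> rows = new ArrayList<>();
        for (Triple t : triples) {
            if (FOAF_NAME.equals(t.predicate)) {
                rows.add(new String[] {t.subject, t.object});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical triples standing in for the merged FOAF file.
        List<Triple> triples = List.of(
                new Triple("#me", FOAF_NAME, "David Ochoa"),
                new Triple("#me", FOAF_KNOWS, "#05"),
                new Triple("#05", FOAF_NAME, "Diana Barrios"));
        System.out.println(QUERY);
        for (String[] row : select(triples)) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```

In Protege itself only the query text matters; the matching loop is there just to make the result of the pattern visible without any extra library.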

Google’s Recipe Site (Ontology) Assig2

13 Sep

Here is a useful link for Google’s Recipe site:

and this is how it looks in XML:

My FOAF File Assig2

11 Sep

<foaf:PersonalProfileDocument rdf:about="">
<foaf:maker rdf:resource="#me"/>
<foaf:primaryTopic rdf:resource="#me"/>
<admin:generatorAgent rdf:resource=""/>
<admin:errorReportsTo rdf:resource=""/>
<foaf:Person rdf:ID="me">
<foaf:name>David Ochoa</foaf:name>
<foaf:homepage rdf:resource=""/>
<foaf:depiction rdf:resource="N/A"/>
<foaf:phone rdf:resource="tel:000-000-0000"/>
<foaf:workplaceHomepage rdf:resource=""/>
<foaf:workInfoHomepage rdf:resource="Blogging"/>
<foaf:schoolHomepage rdf:resource=""/>
<foaf:name>Eduardo Pajaro</foaf:name>
<foaf:name>Andres Mendez</foaf:name>
<foaf:name>Jhon Campo</foaf:name>