Archive of articles classified as "Uncategorized"


Passing Kerberos TGT (ticket-granting ticket) to remote hosts with ssh

18/11/2009

Kerberos uses tickets to grant access to resources on a Kerberos-enabled computer. If you want to log in (via SSH) to a remote host and you don’t want to run the kinit command again after you log in, you can just forward your ticket with your ssh client. Assuming that your Kerberos client is already configured, two steps are needed:

  1. Create a forwardable Kerberos ticket on your machine.
  2. Forward your ticket while logging in to the remote machine.

To create a forwardable ticket, run kinit with the “-f” argument, e.g.:

pythoagoras:~ asteriosk$ kinit -f
Please enter the password for username@domain.com:

To tell the ssh client to forward your ticket to the remote machine, you have to configure it accordingly. The easiest way is to add two directives to your ssh client configuration file, ~/.ssh/config (create it if it does not exist).

chercheurs2-235:~ asteriosk$ more ~/.ssh/config
Host domain.com
        GSSAPIAuthentication yes
        GSSAPIDelegateCredentials yes

Of course, substitute domain.com and username accordingly to match your configuration. This works for both Linux and Mac OS X clients.
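
To check that the ticket you obtained is really forwardable, and to try the delegation once without editing ~/.ssh/config, something like the following sketch may help. The function names and the address used in the usage note are placeholders that do not come from the setup described here; the commands rely only on the -f options of kinit and klist and on the -o option of the OpenSSH client.

```shell
#!/bin/sh
# Sketch only; the function names are hypothetical helpers.

# Request a forwardable TGT, then list the ticket cache with its flags.
# A forwardable ticket has "F" among its flags.
get_forwardable_ticket() {
    kinit -f && klist -f
}

# Log in once with GSSAPI authentication and credential delegation,
# the command-line equivalent of the two config directives above.
ssh_with_ticket() {
    ssh -o GSSAPIAuthentication=yes -o GSSAPIDelegateCredentials=yes "$@"
}
```

After defining the functions, run get_forwardable_ticket and then ssh_with_ticket user@remote.example.com. If the server accepts delegated credentials, running klist on the remote host should list the forwarded ticket.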


Google’s support for RDFa and Microformats

19/10/2009

Google has announced that its search engine is going to support enhanced searching in web pages by using RDFa and Microformats embedded in XHTML. Google states that the extra (structured) data will be used to get results for Product Reviews (e.g. CNET Reviews), Products (e.g. Amazon product pages) and People (e.g. LinkedIn profiles), and that any other types of resources will be made public through data-vocabulary.org. W3C is pretty happy about that.

The news is good for three reasons:

  1. Google supports an Open Standard (RDFa by W3C) and also an Open RDF Vocabulary
  2. Structured Data embedded in human readable web pages are going to start showing up. Content providers will start using RDFa or Microformats to get Google’s Rich Snippets in search results. Of course other Search Engines will follow :) Yahoo! already supports RDFa in SearchMonkey.
  3. One more big step towards the Semantic Web.

I suspect that data from RDFa or Microformats will also be used in some way by Google for ranking search results, and that a new SEO era is going to start. However, it is too early to make predictions.

From my personal point of view, if RDFa is finally going to be widely adopted, it will be the first time that scientists will have a Web-scale distributed, structured data “playground” to do research on. Although I am not an expert in the field, I remember that Semantic Web and Large Graph processing had scalability problems (correct me if I’m wrong!). The Web Graph (that is as simple as “one page links another”) is going to be much more complex and semantically “meaningful”. I am very curious to see what comes next in this direction!


Merging multiple Lucene indexes

31/03/2009

This is the code that I use to merge multiple Lucene indexes into one. There are many reasons to merge multiple indexes into one, such as:

    • Speed
    • Ease of management
    • Space – the size of the merged index is less than the sum of non-merged indexes

Here is the code of an Index Merger:

package ucy.cs.hpcl.minerSoft.indexmanipulation;

/*This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

Author: Asterios Katsifodimos (http://www.asteriosk.gr)
*/
import java.io.File;
import java.io.IOException;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexMerger {

	/** Merge all Lucene indexes found under a directory into a single index. */
	public static void main(String[] args) {

		if(args.length != 2){
			System.out.println("Usage: java -jar IndexMerger.jar " +
					           "existing_indexes_dir merged_index_dir");
			System.out.println(" existing_indexes_dir: A directory where the " +
					             "indexes that have to be merged exist");
			System.out.println("   e.g. indexes/");
			System.out.println("   e.g.         index1");
			System.out.println("   e.g.         index2");
			System.out.println("   e.g.         index3");
			System.out.println(" merged_index_dir: A directory where the merged " +
					                               "index will be stored");
			System.out.println("   e.g. merged_indexes");
			System.exit(1);
		}

		File INDEXES_DIR  = new File(args[0]);
		File INDEX_DIR    = new File(args[1]);

		INDEX_DIR.mkdir();

		Date start = new Date();

		try {
			IndexWriter writer = new IndexWriter(INDEX_DIR,
												new StandardAnalyzer(),
												true);
			writer.setMergeFactor(1000);
			writer.setRAMBufferSizeMB(50);

			// List the input directory once; every entry is assumed to be a Lucene index.
			String[] indexNames = INDEXES_DIR.list();
			Directory indexes[] = new Directory[indexNames.length];

			for (int i = 0; i < indexNames.length; i++) {
				System.out.println("Adding: " + indexNames[i]);
				indexes[i] = FSDirectory.getDirectory(INDEXES_DIR.getAbsolutePath()
													+ "/" + indexNames[i]);
			}

			System.out.print("Merging added indexes...");
			writer.addIndexes(indexes);
			System.out.println("done");

			System.out.print("Optimizing index...");
			writer.optimize();
			writer.close();
			System.out.println("done");

			Date end = new Date();
			System.out.println("It took: "+((end.getTime() - start.getTime()) / 1000)
											+ "\"");

		} catch (IOException e) {
			e.printStackTrace();
		}
	}
}

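You can play with the following two values to gain some more performance; my settings are pretty generic. The merge factor controls how many segments are merged at a time, and the RAM buffer size sets how much memory (in MB) the writer uses before it flushes to disk.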

writer.setMergeFactor(1000);
writer.setRAMBufferSizeMB(50);

Downloads

Here you can download an Index Merger that takes two arguments: the folder containing the indexes to be merged, and an output directory where it stores the merged index.

    • IndexMerger.zip: contains a standalone jar file (with Lucene bundled) and the source code of the IndexMerger.
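
Before running the jar, note that the code above treats every entry of existing_indexes_dir as an index, so a stray file (for example a .DS_Store created by Finder on a Mac) would probably make the merge fail. Here is a minimal sketch of a wrapper that checks for this first. The function names are hypothetical, and IndexMerger.jar is assumed to sit in the current directory.

```shell
#!/bin/sh
# Sketch only; the function names are hypothetical helpers.

# Print every entry directly inside the given directory that is not a
# subdirectory. Hidden entries such as .DS_Store are included.
stray_entries() {
    find "$1" -mindepth 1 -maxdepth 1 ! -type d
}

# Run the merger only if the input directory holds nothing but subdirectories.
merge_indexes() {
    indexes_dir="$1"
    merged_dir="$2"
    if [ -n "$(stray_entries "$indexes_dir")" ]; then
        echo "Non-directory entries found in $indexes_dir:" >&2
        stray_entries "$indexes_dir" >&2
        return 1
    fi
    java -jar IndexMerger.jar "$indexes_dir" "$merged_dir"
}
```

With the layout from the usage message, the call would be merge_indexes indexes/ merged_indexes.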

Leave a comment if you find something wrong with this code!


Installing MacTex and TeXlipse on Mac OS X

25/02/2009

TeXlipse is an Eclipse plugin that lets you manage your TeX files, compile them, and convert them to PDF. It runs inside Eclipse and it's very easy to use. Using Eclipse as a tool for TeX is nice because you can use all of Eclipse’s other tools to be more productive (CVS, SVN, Mylyn, etc.).

The only requirement for installing TeXlipse is to have a TeX toolchain already installed on your Mac. The recommended way to get the whole toolchain is to install MacTex: just visit http://www.tug.org/mactex/ and install it (the file you are about to download is about 1.2 GB!).

After you have installed MacTex, you have to install TeXlipse.

Read the rest of this article »


SSH Tunneling to redirect requests from a local port to a remote one

20/02/2009

Suppose that you want to access a service that runs on port 3306 of a remote machine. Also suppose that the remote machine only accepts requests to that port when they come from the host “localhost”. You will have to create a tunnel to that machine and send all your requests from a port on your local computer, e.g. 2000, to the remote host’s port 3306.

ssh  -L 2000:localhost:3306  root@asteriosk.gr

After doing this, every request to localhost:2000 will be redirected (tunneled) to the remote machine at port 3306 through a secure channel! The remote machine will accept all requests coming from the tunnel as if they were coming from localhost.

For me, this was a very nice way to access my MySQL database from my computer with the Sequel Pro client, which does not support SOCKS proxies. I tell Sequel Pro to connect to 127.0.0.1:2000 and all the requests that I make are redirected to my host (asteriosk.gr), so that MySQL thinks I am a local user and lets me in.
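
If you only need the tunnel and not a remote shell, the same forwarding can be kept in the background and used from the mysql command-line client. This is a rough sketch under a few assumptions: the function names are hypothetical, and the database user in the usage note is a placeholder.

```shell
#!/bin/sh
# Sketch only; the function names are hypothetical helpers.

# Open the tunnel in the background.
#   -N  do not run a remote command, only forward ports
#   -f  go to the background after authentication
# Arguments: local port, remote port, user@host
open_tunnel() {
    ssh -f -N -L "$1:localhost:$2" "$3"
}

# Connect the mysql command-line client through the tunnel.
# Use 127.0.0.1 rather than "localhost": on Unix the client treats
# "localhost" as a request for the local socket file and would
# bypass the tunnel.
# Arguments: local port, database user
mysql_via_tunnel() {
    mysql -h 127.0.0.1 -P "$1" -u "$2" -p
}
```

For the example above, that would be open_tunnel 2000 3306 root@asteriosk.gr followed by mysql_via_tunnel 2000 dbuser.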

Let me know if anything here is unclear!
