Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 
 
 package org.apache.hadoop.hbase.io;

 import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.io.InputStream;
 import java.util.Collection;

 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
 import org.apache.hadoop.fs.FSDataInputStream;
 import org.apache.hadoop.fs.FileStatus;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.PositionedReadable;
 import org.apache.hadoop.fs.Seekable;
 import org.apache.hadoop.hbase.util.FSUtils;

The FileLink is a sort of hardlink that allows access to a file given a set of locations.

The Problem:

  • HDFS doesn't support hardlinks, which makes it impossible to reference the same data blocks under different names.
  • HBase stores files in one location (e.g. table/region/family/) and, when a file is no longer needed (e.g. compaction, region deletion, ...), moves it to an archive directory.
If we want to create a reference to a file, we need to remember that it can be in its original location or in the archive folder. The FileLink class abstracts this concept: given a set of locations, it is able to switch between them, making the operation transparent to the user. HFileLink is a more concrete implementation of FileLink.

Back-references: To help the org.apache.hadoop.hbase.master.cleaner.CleanerChore to keep track of the links to a particular file, during the FileLink creation, a new file is placed inside a back-reference directory. There's one back-reference directory for each file that has links, and in the directory there's one file per link.

HFileLink Example

  • /hbase/table/region-x/cf/file-k (Original File)
  • /hbase/table-cloned/region-y/cf/file-k.region-x.table (HFileLink to the original file)
  • /hbase/table-2nd-cloned/region-z/cf/file-k.region-x.table (HFileLink to the original file)
  • /hbase/.archive/table/region-x/.links-file-k/region-y.table-cloned (Back-reference to the link in table-cloned)
  • /hbase/.archive/table/region-x/.links-file-k/region-z.table-2nd-cloned (Back-reference to the link in table-2nd-cloned)
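
For illustration, a minimal usage sketch (the paths and the fs FileSystem instance below are hypothetical; in HBase the HFileLink subclass computes the candidate locations for you):

  // Candidate locations for the same store file: the original place and the archive.
  Path original = new Path("/hbase/table/region-x/cf/file-k");
  Path archived = new Path("/hbase/.archive/table/region-x/cf/file-k");

  // The link resolves to whichever location currently holds the file.
  FileLink link = new FileLink(original, archived);
  if (link.exists(fs)) {
    Path current = link.getAvailablePath(fs);   // original or archived, whichever exists
    FSDataInputStream in = link.open(fs);       // keeps working if the file moves later
    in.close();
  }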
 
 public class FileLink {
   private static final Log LOG = LogFactory.getLog(FileLink.class);

  
Define the Back-reference directory name prefix: .links-<hfile>/
 
   public static final String BACK_REFERENCES_DIRECTORY_PREFIX = ".links-";

  
FileLink InputStream that handles the switch between the original path and the alternative locations, when the file is moved.
 
   private static class FileLinkInputStream extends InputStream
      implements Seekable, PositionedReadable {
    private FSDataInputStream in = null;
    private Path currentPath = null;
    private long pos = 0;
    private final FileLink fileLink;
    private final int bufferSize;
    private final FileSystem fs;
    public FileLinkInputStream(final FileSystem fs, final FileLink fileLink)
        throws IOException {
      this(fs, fileLink, FSUtils.getDefaultBufferSize(fs));
    }

    public FileLinkInputStream(final FileSystem fs, final FileLink fileLink, int bufferSize)
        throws IOException {
      this.bufferSize = bufferSize;
      this.fileLink = fileLink;
      this.fs = fs;
      this.in = tryOpen();
    }
    @Override
    public int read() throws IOException {
      int res;
      try {
        res = in.read();
      } catch (FileNotFoundException e) {
        res = tryOpen().read();
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        res = tryOpen().read();
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        res = tryOpen().read();
      }
      if (res > 0) pos += 1;
      return res;
    }
    @Override
    public int read(byte b[]) throws IOException {
       return read(b, 0, b.length);
    }
    @Override
    public int read(byte b[], int off, int len) throws IOException {
      int n;
      try {
        n = in.read(b, off, len);
      } catch (FileNotFoundException e) {
        n = tryOpen().read(b, off, len);
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        n = tryOpen().read(b, off, len);
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        n = tryOpen().read(b, off, len);
      }
      if (n > 0) pos += n;
      assert(in.getPos() == pos);
      return n;
    }
    @Override
    public int read(long position, byte[] buffer, int offset, int length) throws IOException {
      int n;
      try {
        n = in.read(position, buffer, offset, length);
      } catch (FileNotFoundException e) {
        n = tryOpen().read(position, buffer, offset, length);
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        n = tryOpen().read(position, buffer, offset, length);
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        n = tryOpen().read(position, buffer, offset, length);
      }
      return n;
    }
    @Override
    public void readFully(long position, byte[] buffer) throws IOException {
      readFully(position, buffer, 0, buffer.length);
    }
    @Override
    public void readFully(long position, byte[] buffer, int offset, int length) throws IOException {
      try {
        in.readFully(position, buffer, offset, length);
      } catch (FileNotFoundException e) {
        tryOpen().readFully(position, buffer, offset, length);
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        tryOpen().readFully(position, buffer, offset, length);
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        tryOpen().readFully(position, buffer, offset, length);
      }
    }
    @Override
    public long skip(long n) throws IOException {
      long skipped;
      try {
        skipped = in.skip(n);
      } catch (FileNotFoundException e) {
        skipped = tryOpen().skip(n);
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        skipped = tryOpen().skip(n);
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        skipped = tryOpen().skip(n);
      }
      if (skipped > 0) pos += skipped;
      return skipped;
    }
    @Override
    public int available() throws IOException {
      try {
        return in.available();
      } catch (FileNotFoundException e) {
        return tryOpen().available();
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        return tryOpen().available();
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        return tryOpen().available();
      }
    }
    @Override
    public void seek(long pos) throws IOException {
      try {
        in.seek(pos);
      } catch (FileNotFoundException e) {
        tryOpen().seek(pos);
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        tryOpen().seek(pos);
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        tryOpen().seek(pos);
      }
      this.pos = pos;
    }
    @Override
    public long getPos() throws IOException {
      return pos;
    }
    @Override
    public boolean seekToNewSource(long targetPos) throws IOException {
      boolean res;
      try {
        res = in.seekToNewSource(targetPos);
      } catch (FileNotFoundException e) {
        res = tryOpen().seekToNewSource(targetPos);
      } catch (NullPointerException e) { // HDFS 1.x - DFSInputStream.getBlockAt()
        res = tryOpen().seekToNewSource(targetPos);
      } catch (AssertionError e) { // assert in HDFS 1.x - DFSInputStream.getBlockAt()
        res = tryOpen().seekToNewSource(targetPos);
      }
      if (res) pos = targetPos;
      return res;
    }
    @Override
    public void close() throws IOException {
      in.close();
    }
    @Override
    public synchronized void mark(int readlimit) {
    }
    @Override
    public synchronized void reset() throws IOException {
      throw new IOException("mark/reset not supported");
    }
    @Override
    public boolean markSupported() {
      return false;
    }

    
Try to open the file from one of the available locations.

Returns:
FSDataInputStream stream of the opened file link
Throws:
java.io.IOException on unexpected error, or file not found.
    private FSDataInputStream tryOpen() throws IOException {
      for (Path path: fileLink.getLocations()) {
        if (path.equals(currentPath)) continue;
        try {
          in = fs.open(path, bufferSize);
          if (pos != 0) in.seek(pos);
          assert(in.getPos() == pos) : "Link unable to seek to the right position=" + pos;
          if (LOG.isTraceEnabled()) {
            if (currentPath == null) {
              LOG.debug("link open path=" + path);
            } else {
              LOG.trace("link switch from path=" + currentPath + " to path=" + path);
            }
          }
          currentPath = path;
          return in;
        } catch (FileNotFoundException e) {
          // Try another file location
        }
      }
      throw new FileNotFoundException("Unable to open link: " + fileLink);
    }
  }
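
To make the retry behaviour above concrete, here is a hedged sketch (statements as they might appear in a test method that throws IOException; the local paths, the LocalFileSystem, and the write/rename steps are hypothetical scaffolding). On HDFS, once the archiver moves the file, the next read on the old handle typically fails with FileNotFoundException and tryOpen() transparently re-opens the stream at the alternative location, restoring the previous position.

  Configuration conf = new Configuration();
  FileSystem fs = FileSystem.getLocal(conf);
  Path original = new Path("/tmp/filelink-demo/original/file-k");
  Path archived = new Path("/tmp/filelink-demo/archive/file-k");

  // Create a small file at the original location.
  FSDataOutputStream out = fs.create(original);
  out.write(new byte[] { 1, 2, 3, 4 });
  out.close();

  FileLink link = new FileLink(original, archived);
  FSDataInputStream in = link.open(fs);
  in.read();                      // served from 'original'

  fs.mkdirs(archived.getParent());
  fs.rename(original, archived);  // simulate the archiver moving the file

  in.read();                      // if the old handle now fails (as it does on HDFS after a
                                  // move), FileLinkInputStream re-opens from 'archived';
                                  // a local filesystem may keep serving the old handle
  in.close();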
  private Path[] locations = null;
  protected FileLink() {
    this.locations = null;
  }

  

Parameters:
originPath Original location of the file to link
alternativePaths Alternative locations to look for the linked file
  public FileLink(Path originPath, Path... alternativePaths) {
    setLocations(originPath, alternativePaths);
  }

  

Parameters:
locations locations to look for the linked file
  public FileLink(final Collection<Path> locations) {
    this.locations = locations.toArray(new Path[locations.size()]);
  }

  

Returns:
the locations to look for the linked file.
  public Path[] getLocations() {
    return locations;
  }
  public String toString() {
    StringBuilder str = new StringBuilder(getClass().getName());
    str.append(" locations=[");
    for (int i = 0; i < locations.length; ++i) {
      if (i > 0) str.append(", ");
      str.append(locations[i].toString());
    }
    str.append("]");
    return str.toString();
  }

  

Returns:
true if the file pointed to by the link exists
  public boolean exists(final FileSystem fs) throws IOException {
    for (int i = 0; i < locations.length; ++i) {
      if (fs.exists(locations[i])) {
        return true;
      }
    }
    return false;
  }

  

Returns:
the path of the first available link.
  public Path getAvailablePath(FileSystem fs) throws IOException {
    for (int i = 0; i < locations.length; ++i) {
      if (fs.exists(locations[i])) {
        return locations[i];
      }
    }
    throw new FileNotFoundException("Unable to open link: " + this);
  }

  
Get the FileStatus of the referenced file.

Parameters:
fs org.apache.hadoop.fs.FileSystem on which to get the file status
Returns:
FileStatus of the referenced file.
Throws:
java.io.IOException on unexpected error.
  public FileStatus getFileStatus(FileSystem fs) throws IOException {
    for (int i = 0; i < locations.length; ++i) {
      try {
        return fs.getFileStatus(locations[i]);
      } catch (FileNotFoundException e) {
        // Try another file location
      }
    }
    throw new FileNotFoundException("Unable to open link: " + this);
  }

  
Open the FileLink for read.

It uses a wrapper of FSDataInputStream that is agnostic to the location of the file, even if the file switches between locations.

Parameters:
fs org.apache.hadoop.fs.FileSystem on which to open the FileLink
Returns:
InputStream for reading the file link.
Throws:
java.io.IOException on unexpected error.
  public FSDataInputStream open(final FileSystem fs) throws IOException {
    return new FSDataInputStream(new FileLinkInputStream(fs, this));
  }

  
Open the FileLink for read.

It uses a wrapper of FSDataInputStream that is agnostic to the location of the file, even if the file switches between locations.

Parameters:
fs org.apache.hadoop.fs.FileSystem on which to open the FileLink
bufferSize the size of the buffer to be used.
Returns:
InputStream for reading the file link.
Throws:
java.io.IOException on unexpected error.
  public FSDataInputStream open(final FileSystem fs, int bufferSize) throws IOException {
    return new FSDataInputStream(new FileLinkInputStream(fs, this, bufferSize));
  }
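
A short, hedged usage sketch of the two open() variants ('link' and 'fs' are assumed to be set up as in the earlier example; the buffer size is arbitrary):

  FSDataInputStream in = link.open(fs, 64 * 1024);  // explicit buffer size
  byte[] header = new byte[8];
  in.readFully(0, header);   // positional read, resolved through the link like any other read
  in.seek(0);
  int first = in.read();     // seek + sequential reads also go through the link
  in.close();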

  
NOTE: This method must be used only in the constructor! It creates an array with the specified locations for the link.
  protected void setLocations(Path originPath, Path... alternativePaths) {
    assert this.locations == null : "Link locations already set";
    this.locations = new Path[1 + alternativePaths.length];
    this.locations[0] = originPath;
    System.arraycopy(alternativePaths, 0, this.locations, 1, alternativePaths.length);
  }

  
Get the directory to store the link back references

To simplify the reference count process, during the FileLink creation a back-reference is added to the back-reference directory of the specified file.

Parameters:
storeDir Root directory for the link reference folder
fileName File Name with links
Returns:
Path for the link back references.
  public static Path getBackReferencesDir(final Path storeDir, final String fileName) {
    return new Path(storeDir, BACK_REFERENCES_DIRECTORY_PREFIX + fileName);
  }

  
Get the referenced file name from the reference link directory path.

Parameters:
dirPath Link references directory path
Returns:
Name of the file referenced
  public static String getBackReferenceFileName(final Path dirPath) {
    return dirPath.getName().substring(BACK_REFERENCES_DIRECTORY_PREFIX.length());
  }

  
Checks if the specified directory path is a back reference links folder.

Parameters:
dirPath Directory path to verify
Returns:
True if the specified directory is a link references folder
  public static boolean isBackReferencesDir(final Path dirPath) {
    if (dirPath == null) return false;
    return dirPath.getName().startsWith(BACK_REFERENCES_DIRECTORY_PREFIX);
  }
}
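
A hedged sketch of how the back-reference helpers fit together (the store directory path is hypothetical and mirrors the example at the top of this file):

  Path storeDir = new Path("/hbase/.archive/table/region-x");
  Path backRefDir = FileLink.getBackReferencesDir(storeDir, "file-k");
  // backRefDir = /hbase/.archive/table/region-x/.links-file-k

  boolean isBackRef = FileLink.isBackReferencesDir(backRefDir);     // true
  String fileName = FileLink.getBackReferenceFileName(backRefDir);  // "file-k"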