Author Topic: File Storage: File System versus Database and technical questions.  (Read 5037 times)

Charles

  • Newbie
  • *
  • Posts: 6
    • View Profile
Hi,

Feng Office has huge potential. It is exceptionally well conceived and executed. But, like any meaningful project emerging from adolescence, the technical details are a bit hidden. No matter. I don't mind asking questions.

I am trying to determine the technical pros and cons of using File System for File Storage versus Database. My first inclination was that files might be retained in their native format in File System and that that would provide some comfort - users could inspect the directory and even fetch documents directly if necessary.

Now I see that the directory structure is difficult to decipher (not a bad thing, just a difficult thing) and I know that files in uploads do not retain their native file form. Fair enough. But, honestly, I'm not at all sure what is actually placed beneath uploads. I uploaded a large (4+ MB) image and saw no new directories appear under uploads, whereas if I fetch new mail I see plenty.

I went to the MySQL database to see what I might learn there.

I found entries in the og_project_files table referencing the five files that had been uploaded, and complementary records in og_project_file_revisions. In og_project_file_revisions I noted with interest a repository_id, for example 2ef292272ce1f244ef08bca91275b5b7df087507, similar in structure to what I found in various actual uploads directories. But a find for that particular directory failed to produce a hit.

While looking at og_file_repo_attributes I found the id matching the example repository_id. While I could find other og_file_repo_attributes ids in the uploads directory, none of the og_project_file_revisions repository_ids were found.

I have spent several hours searching for answers. Maybe you might help me?

1) I saw a slight reference on a page indicating that MySQL was the preferred file repository. Please confirm. We are still early enough in our review to change the file storage mechanism.

2) Why couldn't I find actual uploaded files in the uploads directory, but only directories with keys matching the emails and perhaps a few other objects (avatars, for example)? Where did the large 4MB image I uploaded, which matched the key referred to earlier, go?

3) Our actual business case will necessarily result in tens of thousands of engineering documents, some quite large. Since those uploads appear to be separated from other files in uploads (somehow, although frankly, it doesn't feel right), if the File Storage is MySQL, will it be able to handle large volumes of large files as BLOBs, or does it put them outside of the database table?

4) We are likely to look at contributions in either monetary form or of code if and when we adopt. Need some help in support of that objective.

Thank you in advance,
« Last Edit: January 27, 2011, 10:15:54 am by Charles »

supadoctor

  • Jr. Member
  • **
  • Posts: 99
    • View Profile
    • Email
Re: File Storage: File System versus Database and technical questions.
« Reply #1 on: January 27, 2011, 05:49:28 am »
Not answers, but some notes about the "file system" way.

Each version of a file is linked to the repository via the repository_id field. This id determines the folder structure inside the "upload" folder:
ROOT
-> "upload"
     -> first three characters of repository_id
          -> characters 4-6 of repository_id
               -> characters 7-9 of repository_id
                    -> target file in original format; the file name is the remaining characters of the id
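
The layout described above can be sketched in a few lines of Python. This is only an illustration of the description in this post, not Feng Office code: the function name `repository_path` and the `upload_root` default are hypothetical, and the root folder may be named differently on a given install.

```python
from pathlib import PurePosixPath


def repository_path(repository_id: str, upload_root: str = "upload") -> PurePosixPath:
    """Map a repository_id to the nested path described in the post.

    Assumption: the id is split 3 / 3 / 3 / remainder, and the remainder
    is used as the file name.
    """
    if len(repository_id) <= 9:
        raise ValueError("repository_id must be longer than 9 characters")
    return PurePosixPath(
        upload_root,
        repository_id[0:3],   # characters 1-3
        repository_id[3:6],   # characters 4-6
        repository_id[6:9],   # characters 7-9
        repository_id[9:],    # remaining characters: the file name
    )


# The repository_id quoted in the first post of this thread.
example_id = "2ef292272ce1f244ef08bca91275b5b7df087507"
print(repository_path(example_id))
# upload/2ef/292/272/ce1f244ef08bca91275b5b7df087507
```

If the layout really is as described, searching for the full id as a directory name would find nothing, because the id is split across three directory levels and a file name.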

Charles

  • Newbie
  • *
  • Posts: 6
    • View Profile
Re: File Storage: File System versus Database and technical questions.
« Reply #2 on: January 27, 2011, 10:30:49 am »
Thanks for demystifying the repository id.

franponce87

  • Administrator
  • Hero Member
  • *****
  • Posts: 1819
    • View Profile
    • Email
Re: File Storage: File System versus Database and technical questions.
« Reply #3 on: February 16, 2011, 01:31:01 pm »
Hi Charles, welcome to Feng Office Forums!
It is great to know you have liked Feng Office that much.
Sorry for the late reply, although I see that supadoctor already answered your main question some time ago.

1- We do recommend using the file system.
2- See supadoctor's reply above.
3- Among other reasons, we use a file repository so as not to compromise database performance. Either way, certain information about the files is stored in database tables, of course.
4- I did not quite understand what you meant by this.


Best regards,
Francisco
Would you like to install Feng Office Professional or Enterprise Edition in your servers? No problem! Read this article!

Charles

  • Newbie
  • *
  • Posts: 6
    • View Profile
Re: File Storage: File System versus Database and technical questions.
« Reply #4 on: February 16, 2011, 04:23:24 pm »
Hello Francisco,

1) Regarding File Storage:

I had read that Feng Office recommended database storage (I can't find that source currently). I wondered about database limitations and confirmed that current Linux variations, including Ubuntu 10.04, deploy with an EXT4 file system, which supports MySQL database sizes up to 4TB. On the other hand, the only reason to store file content in the database is if one absolutely needs the security model that the database provides. Because Feng Office stores plain text emails in the file system storage, someone could grep sensitive information with ease. Since storage space is not a limiting factor and security is paramount, we switched to database storage. Admittedly, blob data must be escaped and un-escaped, but that alone appears to be just a performance issue easily solved with hardware. There may be a search issue with blob data as well, but that doesn't seem to be the case.
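
On the escaping point: when a database driver supports bound parameters, binary content is passed separately from the SQL text, so the application does not have to escape it by hand. A minimal sketch follows, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `file_contents` and the helper functions are hypothetical and are not Feng Office's actual schema or code.

```python
import sqlite3

# In-memory database used purely for illustration (stand-in for MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_contents (repository_id TEXT PRIMARY KEY, content BLOB)"
)


def store_file(repository_id: str, data: bytes) -> None:
    """Insert binary content using bound parameters (no manual escaping)."""
    conn.execute(
        "INSERT INTO file_contents (repository_id, content) VALUES (?, ?)",
        (repository_id, data),
    )


def load_file(repository_id: str) -> bytes:
    """Fetch the binary content back exactly as it was stored."""
    row = conn.execute(
        "SELECT content FROM file_contents WHERE repository_id = ?",
        (repository_id,),
    ).fetchone()
    if row is None:
        raise KeyError(repository_id)
    return bytes(row[0])


# Every possible byte value, including quotes, backslashes and NUL.
payload = bytes(range(256))
store_file("demo-id", payload)
print(load_file("demo-id") == payload)
```

Whether a particular MySQL client library escapes on the client side or sends the value separately is a driver detail; either way the round trip returns the same bytes.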

2) Regarding supadoctor's reply

He said he didn't really answer the question, just told me how the directory structure is encoded relative to the document.

3) Of course, metadata is stored in the database along with keys regardless of storage mechanism. I'm not sure why the file storage mechanism was so elaborately constructed to make casual human browsing so difficult. Complexity is not security.

4) Monetary Contributions:

I see a great deal of potential for FO in our business case (discussed in some respects below). We would be committing some vital internal process to FO and want to be assured that the project has a significant future.

We also develop, but do not want to make customizations and wind up with an orphaned version. We would like to have good rapport and communication with FO staff to make sure our planned contributions are not over-lapping other FO development initiatives and to make sure they are deemed worthy enough by FO to assure incorporation into future FO releases.

We have current issues arising from the lack of technical documentation and would like to pay for support to get some of these technical questions answered quickly. Wading through code to find answers is much more expensive than paying someone who already knows the answers, and most of the forum dialogue fails to deal with what I consider to be significant issues. One example: what's up with Child Objects? They seem to be a wonderful concept that has not yet reached a functional level in code.

Our Current Business Case Involves Document Control for all engineering and other documentation associated with custom engineered skid mounted equipment that we build - ours is an engineer to order business.

The key concept currently revolves around utilizing Workspaces and Permissions to enforce document control processes.

This is a subtle, but very powerful capability which appears largely not discussed. Beyond version control and activity logging, Workspaces in conjunction with Permissions may effectively govern the actual migration of documentation through various departments and gates. Below is the business case briefly outlined:

Document Release and Notification

No document is official unless uploaded.

o   We can use workspace permissions to greatly simplify document release procedures.

§  Owner Co/Client Interaction example:

·          An Engineer uploads a document (For Review, Approved, Certified for Construction, whatever) into the Project’s “Pending Release” workspace:

o   All internal users would have access to that workspace.

o   Project Manager and other Department Managers and Participating Client agents would have access to a “Released to Client” workspace, but other internal users would not.

§  The Project Manager would simply drag/move a document he wanted to release from the “Pending Release” workspace to the “Released to Client” workspace.

·         If a client had access to that workspace and subscribed to that workspace, they would get an automatic notification of the workspace activity and could download documents at their convenience (downloading would not alter the existence of the Released Document in the workspace).

·         If a client was not a user of Feng Office, the Project Manager would drag/copy them to another “Transmit to Client” workspace to which Document Control subscribes.

·         Document Control would get an automatic alert, and then would

o   Email the documents to the client from directly within Feng Office; or,

o   Upload to the client’s site; or,

o   Deliver hard-copy via mail or delivery service; or,

o   Do all  or some combination of the above.

·         Document control activities in this process would all be automatically logged and visible to the Project Manager.

§  Owner Co/Owner Co interaction example:


·         An engineer uploads a “Certified for Construction” document into a “Released to Project” workspace:

o   Engineering and Project Management have access to the “Released to Project” workspace, but not Quality Assurance or Shop Management.

·         Project Manager would be “subscribed” to the “Released to Project” workspace and would get an automatic notification that a new document was uploaded by Engineering.

o   The Project Manager would simply drag/move a document he wanted released to Quality Assurance from the “Released to Project” workspace into the “Certified for Construction” workspace.

§   Project Managers and Quality Control would have access to the “Certified for Construction” workspace, but neither Engineering, nor the Shop Managers would.

§  The Quality Assurance Dept would be “subscribed” to the “Certified for Construction” workspace and would get an automatic notification that the Project Manager had released a document for their review and action.

o   The Quality Assurance Dept would review the documents:

§  If acceptable:

·         QA would release them to the Shop by dragging/moving them to the “Released for Construction” workspace.

·         Quality Assurance and Shop Managers would have access to a “Released for Construction” workspace, but Engineers and Project Managers would not.

·         The Shop Manager would be subscribed to the “Released for Construction” workspace and would get an automatic notification that the Quality Assurance Dept had released a document to the “Released for Construction” workspace.

§  If not acceptable:

·         QA would drag/move them to the “Rejected Documents” workspace, commenting on the document with the reason for rejection.

o   Project Management and Engineering would be “subscribed” to the “Rejected Documents” workspace and would get automatic notification of all rejected documents put into the “Rejected Documents” workspace.

o   Throughout the process, all activities are logged and activity is presented to each user according to their permissions.
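
The release flow above can be sketched as a small model: workspaces carry a set of members (who may act there) and a set of subscribers (who are notified), and every upload or move is logged. This is only an illustration of the proposed process. The names `Workspace` and `DocumentControl` and all user names are hypothetical; none of this is Feng Office's API.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    name: str
    members: set = field(default_factory=set)      # users with access
    subscribers: set = field(default_factory=set)  # users notified of activity
    documents: set = field(default_factory=set)


class DocumentControl:
    def __init__(self):
        self.workspaces = {}
        self.activity_log = []   # (user, action, document, workspace name)
        self.notifications = []  # (recipient, message)

    def add_workspace(self, workspace):
        self.workspaces[workspace.name] = workspace

    def _record(self, user, action, document, workspace):
        # Log every action and notify subscribers other than the actor.
        self.activity_log.append((user, action, document, workspace.name))
        for subscriber in sorted(workspace.subscribers - {user}):
            self.notifications.append(
                (subscriber, f"{user} {action} {document} ({workspace.name})")
            )

    def upload(self, user, document, target):
        workspace = self.workspaces[target]
        if user not in workspace.members:
            raise PermissionError(f"{user} has no access to {target}")
        workspace.documents.add(document)
        self._record(user, "uploaded", document, workspace)

    def move(self, user, document, source, target):
        src, dst = self.workspaces[source], self.workspaces[target]
        # A move requires access to both the source and the target workspace.
        if user not in src.members or user not in dst.members:
            raise PermissionError(f"{user} cannot move {source} -> {target}")
        if document not in src.documents:
            raise KeyError(document)
        src.documents.remove(document)
        dst.documents.add(document)
        self._record(user, "moved", document, dst)


control = DocumentControl()
control.add_workspace(Workspace("Released to Project", {"engineer", "pm"}, {"pm"}))
control.add_workspace(Workspace("Certified for Construction", {"pm", "qa"}, {"qa"}))

control.upload("engineer", "drawing-001", "Released to Project")
control.move("pm", "drawing-001", "Released to Project", "Certified for Construction")

for recipient, message in control.notifications:
    print(recipient, "<-", message)
```

In this sketch the engineer cannot move a document into “Certified for Construction” because the engineer is not a member of that workspace, which is the gate the process relies on.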

 
Activity Logging (Chain of Custody)

o   All activity is logged automatically.

§  No one will be able to “touch” a document without the system logging their activities.

o   Documents may be either “checked out” or downloaded.

§  If someone “checks out” a document:

·         Only one user at a time may check-out a document and the system identifies which user has the document checked out.

·         If someone checks out a document, and alters it, they check it back in when they upload the revision.

·         The software enforces the requirement that to check in a checked-out document, the user must upload the same document and annotate the changes (they could just put garbage characters in the field, but the system would record the user who did it.)

§  If someone downloads a document, but doesn’t “check it out”:

·         If someone simply downloads a document, as opposed to checking it out, that’s fine.

·         If they make any changes the altered document is considered a new document and must be uploaded.
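
The check-out rules above amount to a single-holder lock plus a mandatory change note on check-in. A minimal sketch under that reading follows. The class `ControlledDocument` and its method names are hypothetical and do not reflect how Feng Office implements check-out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ControlledDocument:
    name: str
    checked_out_by: Optional[str] = None
    revisions: list = field(default_factory=list)  # (user, change note)
    log: list = field(default_factory=list)        # (user, action)

    def check_out(self, user: str) -> None:
        # Only one user at a time may hold the document.
        if self.checked_out_by is not None:
            raise RuntimeError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = user
        self.log.append((user, "check-out"))

    def check_in(self, user: str, change_note: str) -> None:
        # Only the holder may check in, and a change note is required.
        if self.checked_out_by != user:
            raise PermissionError(f"{user} does not hold the check-out")
        if not change_note.strip():
            raise ValueError("a change note is required on check-in")
        self.revisions.append((user, change_note))
        self.checked_out_by = None
        self.log.append((user, "check-in"))

    def download(self, user: str) -> None:
        # Plain downloads do not lock the document but are still logged.
        self.log.append((user, "download"))


doc = ControlledDocument("drawing-001")
doc.check_out("engineer")
doc.download("pm")
doc.check_in("engineer", "Updated nozzle schedule")
print(doc.log)
```

A second `check_out` while the document is held raises an error, and every action, including a plain download, ends up in the log with the user's name.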

Francisco,

I hope I have not overly burdened you with so much explanation. Hopefully you will find sufficient merit to engage in a meaningful, and mutually beneficial relationship.

Kindest regards,

Charles

franponce87

  • Administrator
  • Hero Member
  • *****
  • Posts: 1819
    • View Profile
    • Email
Re: File Storage: File System versus Database and technical questions.
« Reply #5 on: February 21, 2011, 02:05:14 pm »
Dear Charles,

Regarding Feng Office, let me tell you that we are growing, improving, and fixing things quite fast, and we plan to keep going this way, so yes, the project does have a significant future.

The use cases you mention seem to be doable. I am not sure what you consider 'Document Control' to be: a user or an automated process. If the latter, this is not possible yet, but it could be arranged.

Either way, do keep in mind that we offer what we call an onSite service, and we also do custom development for certain clients. I would suggest you get in touch with sales@fengoffice.com, and I will also ask them to contact you at the email address you used to register on the forum. This way you will be able to find out more about Feng Office, its roadmap, and more.

Best regards,
Francisco
